CompSpoof Dataset

Introduction

The CompSpoof dataset is designed for studying component-level anti-spoofing, where either the speech or the environmental sound component (or both) may be spoofed.

📄 Paper on arXiv

🖥️ Code on GitHub

📢 NEWS!!

We have extended the CompSpoof dataset to CompSpoofV2, which significantly broadens the diversity of attack sources, environmental sounds, and mixing strategies. ✨ In addition, newly generated audio samples are distributed across the test set and are specifically designed to serve as detection data under unseen conditions.

🤗 CompSpoofV2 Details & Download Link: https://xuepingzhang.github.io/CompSpoof-V2-Dataset/

Building upon the CompSpoofV2 dataset and a separation-enhanced joint learning framework, we launched the ICME 2026 Environment-Aware Speech and Sound Deepfake Detection Challenge (ESDD2). We warmly invite researchers from both academia and industry to participate in this challenge and explore robust, effective solutions for these critical deepfake detection tasks.

🖥️ ESDD2 Challenge website: https://sites.google.com/view/esdd-challenge/esdd-challenges/esdd-2/description

📥 Download

You can download the dataset from Hugging Face:
🤗 CompSpoof Download Link


🎧 Audio Examples

Below are audio samples from the CompSpoof dataset. For each class, we provide the mixed/original audio, along with the speech and environment sources.


Class 0 — Original

Label: original

Description: Original bona fide speech and corresponding environment audio without mixing


Class 1 — Bona fide + Bona fide

Label: bonafide_bonafide

Description: Bona fide speech mixed with another bona fide environmental audio


Class 2 — Spoofed Speech + Bona fide Environment

Label: spoof_bonafide

Description: Spoof speech mixed with bona fide environmental audio


Class 3 — Bona fide Speech + Spoofed Environment

Label: bonafide_spoof

Description: Bona fide speech mixed with spoof environmental audio


Class 4 — Spoofed Speech + Spoofed Environment

Label: spoof_spoof

Description: Spoof speech mixed with spoof environmental audio


📂 Dataset Overview

| ID | Mixed | Speech | Environment | Class Label | Description |
|----|-------|--------|-------------|-------------|-------------|
| 0 | No | Bona fide | Bona fide | original | Original bona fide speech and corresponding environment audio without mixing |
| 1 | Yes | Bona fide | Bona fide | bonafide_bonafide | Bona fide speech mixed with another bona fide environmental audio |
| 2 | Yes | Spoofed | Bona fide | spoof_bonafide | Spoof speech mixed with bona fide environmental audio |
| 3 | Yes | Bona fide | Spoofed | bonafide_spoof | Bona fide speech mixed with spoof environmental audio |
| 4 | Yes | Spoofed | Spoofed | spoof_spoof | Spoof speech mixed with spoof environmental audio |
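
The five classes listed above can be written as a small lookup table, which may help when turning class labels into per-component targets. This is only a sketch: the names `ComponentStatus`, `CLASS_TABLE`, and `is_fully_bonafide` are hypothetical and not part of the dataset or its official code, and the `mixed` flag is inferred from the class descriptions.

```python
from typing import NamedTuple


class ComponentStatus(NamedTuple):
    """Per-class component status (hypothetical helper type)."""
    class_id: int
    mixed: bool           # inferred from the descriptions: only class 0 is unmixed
    speech_spoofed: bool
    env_spoofed: bool


# Class label -> component status, following the class definitions above.
CLASS_TABLE = {
    "original":          ComponentStatus(0, False, False, False),
    "bonafide_bonafide": ComponentStatus(1, True,  False, False),
    "spoof_bonafide":    ComponentStatus(2, True,  True,  False),
    "bonafide_spoof":    ComponentStatus(3, True,  False, True),
    "spoof_spoof":       ComponentStatus(4, True,  True,  True),
}


def is_fully_bonafide(label: str) -> bool:
    """Return True when neither the speech nor the environment is spoofed."""
    status = CLASS_TABLE[label]
    return not (status.speech_spoofed or status.env_spoofed)
```

For example, `CLASS_TABLE["bonafide_spoof"].env_spoofed` is `True`, while `is_fully_bonafide("original")` and `is_fully_bonafide("bonafide_bonafide")` both return `True`.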

🗂️ Metadata

The dataset includes three metadata files: CompSpoof_train.txt, CompSpoof_dev.txt, and CompSpoof_eval.txt.

Each line has four fields:

mixed_audio   speech_source   env_source   class_label
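
A metadata file in this four-field layout could be read roughly as follows. This is a minimal sketch under several assumptions: that fields are separated by whitespace, that file paths contain no spaces, and that `class_label` is kept as a plain string. `MetadataEntry`, `parse_metadata_lines`, and `load_metadata` are hypothetical names, not part of the released code.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class MetadataEntry:
    """One metadata line (hypothetical container; fields follow the layout above)."""
    mixed_audio: str
    speech_source: str
    env_source: str
    class_label: str


def parse_metadata_lines(lines: Iterable[str]) -> Iterator[MetadataEntry]:
    """Yield one entry per non-empty line, assuming whitespace-separated fields."""
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(
                f"line {line_number}: expected 4 fields, got {len(fields)}"
            )
        yield MetadataEntry(*fields)


def load_metadata(path: str) -> List[MetadataEntry]:
    """Read a metadata file such as CompSpoof_train.txt into a list of entries."""
    with open(path, encoding="utf-8") as handle:
        return list(parse_metadata_lines(handle))
```

Calling `load_metadata("CompSpoof_train.txt")` would then return a list of entries whose `class_label` can be compared against the five class labels. If the actual files use a different delimiter, the `line.split()` call is the only part that needs to change.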

🎧 Data Sources

Environmental sounds cover indoor, street, and natural settings, ensuring acoustic diversity.

During processing:


🔖 Citation

If you use this dataset in your research, please cite:

@misc{zhang2025compspoofdatasetjointlearning,
      title={CompSpoof: A Dataset and Joint Learning Framework for Component-Level Audio Anti-spoofing Countermeasures}, 
      author={Xueping Zhang and Liwei Jin and Yechen Wang and Linxi Li and Ming Li},
      year={2025},
      eprint={2509.15804},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2509.15804}, 
}

License

This dataset is a derived dataset constructed by combining and mixing audio samples from multiple publicly available datasets.

Users must comply with the license terms of each original dataset. The authors do not claim ownership of the original audio content. Because it includes datasets licensed under CC BY-NC 4.0, this dataset is released under the CC BY-NC 4.0 license.