MultiAPI Spoof: Multi-Source Audio Anti-Spoofing Dataset

Introduction

MultiAPI Spoof is a multi-source audio anti-spoofing dataset containing approximately 230 hours of spoofed audio. The spoofed audio is generated by commercial TTS services, open-source models, and Chinese TTS websites. An equal amount of bonafide speech from CommonVoice is included, giving a 1:1 balance between genuine and spoofed samples. The dataset is designed to support research and model training for audio anti-spoofing.


Spoofed Audio Samples

Audio samples from Commercial TTS APIs

Audio samples from Open-Source Models

Audio samples from TTS Websites

Spoofed Audio Data Sources

Our new dataset, MultiAPI Spoof, contains speech samples generated from a variety of API sources, including:

  1. Commercial TTS APIs – speech generated by commercial services.
  2. Open-Source Model Generation – speech generated by open-source models.
  3. TTS Websites – speech generated on TTS web platforms.

The dataset is organized into 30 API groups, labeled A0–A29, each corresponding to a unique speech generation API source. The duration of speech in each group ranges from 0.2 to 12 hours.


Dataset Split

API No.    train    dev     eval
A0-A20     70%      10%     20%
A21-A23    /        100%    /
A24-A29    /        /       100%
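
The split table above can be encoded as a small lookup, which is handy for sanity-checking that a given API group only shows up in the splits it is supposed to. The following is a minimal sketch, not part of the dataset tooling: the names `SPLIT_TABLE` and `split_fractions` are hypothetical, and a "/" entry in the table is assumed to mean that the group contributes no data to that split.

```python
import re

# Split proportions per API group, transcribed from the table above.
# Assumption: "/" in the table means 0% of that group's data is in the split.
SPLIT_TABLE = [
    (range(0, 21), {"train": 0.7, "dev": 0.1, "eval": 0.2}),   # A0-A20
    (range(21, 24), {"train": 0.0, "dev": 1.0, "eval": 0.0}),  # A21-A23
    (range(24, 30), {"train": 0.0, "dev": 0.0, "eval": 1.0}),  # A24-A29
]


def split_fractions(api_label: str) -> dict:
    """Return the train/dev/eval fractions for an API label such as 'A7'."""
    match = re.fullmatch(r"A(\d+)", api_label)
    if match is None:
        raise ValueError(f"not an API label: {api_label!r}")
    index = int(match.group(1))
    for indices, fractions in SPLIT_TABLE:
        if index in indices:
            return dict(fractions)  # copy so callers cannot mutate the table
    raise ValueError(f"API label out of range A0-A29: {api_label!r}")
```

For example, `split_fractions("A22")` reports that group A22 appears only in the dev split, so any A22 entry found in the train metadata would indicate a problem.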

Metadata

The dataset includes three metadata files: MultiAPI_train.txt, MultiAPI_dev.txt, and MultiAPI_eval.txt.

Each line has three fields:

audio_path     api     class_label
XXX.mp3        A0      spoofed
XXX.mp3        -       bonafide
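
A metadata file in this layout can be read with a few lines of Python. The sketch below is illustrative only and rests on assumptions not stated above: fields are taken to be separated by whitespace, audio paths are taken to contain no spaces, and a line that repeats the column names is treated as a header and skipped. The names `Entry`, `parse_metadata`, and `count_labels` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple, Optional


class Entry(NamedTuple):
    audio_path: str
    api: Optional[str]  # None for bonafide rows, where the file has "-"
    class_label: str    # "spoofed" or "bonafide" in the example above


def parse_metadata(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one Entry per non-empty metadata line."""
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated fields
        if not fields:
            continue  # skip blank lines
        if fields == ["audio_path", "api", "class_label"]:
            continue  # skip a header line, if the file has one
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
        audio_path, api, class_label = fields
        yield Entry(audio_path, None if api == "-" else api, class_label)


def count_labels(lines: Iterable[str]) -> Counter:
    """Count entries per class label, e.g. to check the 1:1 balance."""
    return Counter(entry.class_label for entry in parse_metadata(lines))
```

Passing an open file handle for `MultiAPI_train.txt` to `count_labels` returns the number of spoofed and bonafide entries in that split.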

📖 Citation

If you use this code or dataset, please cite:

@misc{zhang2025multiapispoofmultiapidataset,
      title={MultiAPI Spoof: A Multi-API Dataset and Local-Attention Network for Speech Anti-spoofing Detection}, 
      author={Xueping Zhang and Zhenshan Zhang and Yechen Wang and Linxi Li and Liwei Jin and Ming Li},
      year={2025},
      eprint={2512.07352},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2512.07352}, 
}

Contact

For questions or suggestions, please contact: xueping.zhang@dukekunshan.edu.cn