Spectrogram Dataset of Korean Smartphone Audio Files Forged Using the “Mix Paste” Command

Yeongmin Son, Won Jun Kwak, Jae Wan Park

doi:10.3390/data8120183

Open Access

Journal: Data
Publication Date: Dec 1, 2023
Citations: 1
License type: CC BY 4.0

Affiliation: Soongsil University

Abstract

This study addresses voice forgery detection, a field of growing importance owing to the introduction of advanced voice editing technologies and the proliferation of smartphones. It introduces a dataset built specifically to identify forgeries created using the “Mix Paste” technique. This editing technique can overlay audio segments from similar or different environments without creating a new timeframe, which makes such forgeries nearly impossible to detect using traditional methods. The dataset consists of 4665 spectrogram images derived from 1555 original audio files and 45,672 spectrogram images derived from 15,224 forged audio files. The original audio was recorded using iPhone and Samsung Galaxy smartphones to ensure a realistic sampling environment. The forged files were created from these recordings and subsequently converted into spectrograms. The dataset also provides the metadata of the original voice files, offering additional context that can be used for analysis and detection. The dataset fills a gap in existing research and supports the development of more efficient deep learning models for voice forgery detection. By addressing the “Mix Paste” technique, it meets a critical need in voice authentication and forensics and may contribute to enhancing security in society.
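The difference between a conventional paste and a mix-style paste can be illustrated with a minimal sketch. It assumes that a mix-style paste behaves as a sample-by-sample additive overlay on normalized samples; this is an assumption made for illustration, not the authors' forgery procedure or the behavior of any specific editing tool. The function names `insert_paste` and `mix_paste` are hypothetical. The last lines check the image counts given in the abstract.

```python
def insert_paste(base, clip, start):
    """Conventional paste: splice clip into base, which lengthens the recording."""
    return base[:start] + clip + base[start:]


def mix_paste(base, clip, start):
    """Mix-style paste (assumed additive overlay): the length stays the same."""
    mixed = list(base)
    for offset, sample in enumerate(clip):
        index = start + offset
        if index >= len(mixed):
            break  # assumption: samples past the end of base are dropped
        # Clamp the sum to the normalized range [-1.0, 1.0].
        mixed[index] = max(-1.0, min(1.0, mixed[index] + sample))
    return mixed


# Toy signals standing in for normalized audio samples.
base = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
clip = [0.5, 0.5]

spliced = insert_paste(base, clip, 2)
overlaid = mix_paste(base, clip, 2)

# The spliced signal is longer than the original; the overlaid one is not.
print(len(base), len(spliced), len(overlaid))

# Counts from the abstract: spectrogram images per audio file in each class.
images_per_original = 4665 / 1555
images_per_forged = 45672 / 15224
print(images_per_original, images_per_forged)
```

Under this assumption the overlaid signal keeps the original duration, so a check based on recording length alone would not flag it. The counts in the abstract work out to exactly three spectrogram images per audio file for both the original and the forged class.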

Full Text