Determining whether an audio signal is single-compressed (SC) or double-compressed (DC) is a crucial task in audio forensics, as it bears directly on the integrity of the recording. In this paper, we propose using phase spectrum-based features to detect DC narrowband and wideband adaptive multi-rate (AMR-NB and AMR-WB) speech. To the best of our knowledge, phase spectrum features have not previously been explored for DC audio detection. In addition to introducing these features, we propose a novel parallel LSTM system that simultaneously learns the most representative features from the magnitude and phase spectra of the speech signal and integrates both sets of information to further enhance performance. Analyses demonstrate significant differences between the phase spectra of SC and DC speech signals, suggesting their potential as representative features for DC AMR speech detection. The proposed phase spectrum features perform as well as magnitude spectrum features for the AMR-NB codec and outperform them in detecting AMR-WB speech. They yield an 8% improvement in true positive rate over the magnitude spectrogram features. The proposed parallel LSTM system further improves DC AMR-WB speech detection.
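To make the two feature streams concrete, the following minimal sketch splits a signal into windowed frames and returns the magnitude and phase of each frame's discrete Fourier transform. It is an illustration only, not the feature extraction pipeline used in the paper: the function name `frame_spectra`, the Hann window, the frame length of 64 samples, the hop of 32 samples, and the synthetic 1 kHz test tone are all assumptions chosen to keep the example small.

```python
import cmath
import math


def frame_spectra(signal, frame_len=64, hop=32):
    """Return (magnitudes, phases), one row per frame.

    Each row holds frame_len // 2 + 1 bins (0 Hz up to Nyquist).
    Hypothetical helper; the framing parameters are illustrative assumptions.
    """
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    magnitudes, phases = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mag_row, phase_row = [], []
        for k in range(n_bins):
            # Direct DFT of one bin; an FFT would be used in practice.
            coeff = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mag_row.append(abs(coeff))           # magnitude spectrum
            phase_row.append(cmath.phase(coeff))  # phase in [-pi, pi]
        magnitudes.append(mag_row)
        phases.append(phase_row)
    return magnitudes, phases


# Toy input: a 1 kHz sine sampled at 8 kHz (assumed values for illustration).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
mags, phs = frame_spectra(tone)
peak_bin = max(range(len(mags[0])), key=lambda k: mags[0][k])
```

In a two-branch model of the kind described above, `mags` and `phs` would each form a per-frame sequence fed to its own recurrent branch before the two are combined; how the paper's system performs that combination is not specified here.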