Abstract

Source recording device recognition is a key digital forensic task: identifying the device that produced a recording from the intrinsic characteristics of the audio. It is widely applied in digital audio forensics, including source forensics, tamper detection, and copyright protection. However, existing methods often suffer from low accuracy because they exploit only a limited part of the available information. In this study, we propose a source recording device recognition method based on feature representation learning that addresses this limitation. We introduce a temporal audio feature called the “Sequential Gaussian Mean Matrix (SGMM),” which is derived from temporally segmented acoustic features. We then design a structured representation learning model that combines Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory networks (BiLSTM). The model uses the temporal Gaussian representation and a convolutional bottleneck representation to condense spatial information, and then performs recognition through temporal modeling. In our experiments, the method achieves a recognition accuracy of 98.78% across multiple classes of recording devices and outperforms state-of-the-art methods in recognition performance. Our implementation is publicly available at https://github.com/CCNUZFW/SGMM.
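
To make the idea of a feature built from temporally segmented acoustic features concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each temporal segment is summarized by a single Gaussian, whose maximum-likelihood mean is simply the per-dimension average of the frames in that segment; the abstract does not specify how the Gaussian means are actually estimated. The function name `sequential_mean_matrix` and the parameter `num_segments` are hypothetical.

```python
from statistics import fmean


def sequential_mean_matrix(frames, num_segments):
    """Return one mean vector per consecutive temporal segment.

    frames: time-ordered list of equal-length feature vectors
            (for example, frame-level acoustic features).
    num_segments: hypothetical parameter, number of temporal segments.

    Assumption: each segment is modeled by a single Gaussian, so its
    maximum-likelihood mean is the per-dimension sample mean.
    """
    n = len(frames)
    if num_segments <= 0 or num_segments > n:
        raise ValueError("num_segments must be between 1 and len(frames)")

    # Segment boundaries; leftover frames are spread across segments.
    bounds = [i * n // num_segments for i in range(num_segments + 1)]

    matrix = []
    for start, end in zip(bounds, bounds[1:]):
        segment = frames[start:end]
        # zip(*segment) iterates over feature dimensions (columns).
        matrix.append([fmean(column) for column in zip(*segment)])
    return matrix


if __name__ == "__main__":
    toy_frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
    print(sequential_mean_matrix(toy_frames, 2))
```

The rows of the returned matrix keep the time order of the segments, which is the property a sequence model such as a BiLSTM would rely on.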
