Towards blind detection of steganography in low‐bit‐rate speech streams

Congcong Sun, Yiqiao Cai, Chin‐Chen Chang, Wojciech Mazurczyk, Hui Tian, Yonghong Chen

doi:10.1002/int.23077

Abstract

To prevent the abuse of steganography in low-bit-rate speech from threatening cyberspace security, corresponding steganalysis approaches have been developed and have received significant attention from the research community. However, most existing steganalysis methods assume that the steganography method is known in advance, which is unrealistic in practice. In this paper, we therefore present three blind detection schemes for steganography in low-bit-rate speech streams. The first is based on mixed sample data augmentation. It randomly selects a certain proportion of steganographic samples from the sample set of each steganographic method and combines them with the original carrier samples to form the training set, which enhances the robustness of the model. The second relies on decision fusion: a dedicated classification model is first trained for each steganography method, and a majority voting mechanism is then used in the detection stage to fuse the outputs of these models into the final detection result. Unlike the other two schemes, the third designs its detection model on a self-paced ensemble, according to the distribution characteristics of speech samples. Its main idea is to fully train multiple base classifiers through multiple iterations of under-sampling and to fuse them into a powerful ensemble classifier. In each iteration, the steganography set, which is composed of samples from multiple steganography methods, must be under-sampled. Unlike traditional ensemble classifier solutions, which select steganographic samples at random, we pay more attention to the steganographic samples at the decision boundary. These samples are found using the classification hardness given by the ensemble classifier trained in the previous iteration; they are more informative and more conducive to improving the performance of the base classifiers.
The experimental results show that the three proposed schemes can achieve efficient blind detection of steganography in low-bit-rate speech, and that the scheme based on the self-paced ensemble performs best. Specifically, at an embedding rate of 30%, the accuracy of the scheme based on the self-paced ensemble is more than 85%, while the accuracy of the other two schemes is less than 80%. Additionally, the scheme based on the self-paced ensemble even outperforms dedicated detectors for specific steganographic methods in terms of recall for steganographic sample detection.
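The data-handling step of each of the three schemes can be illustrated with a minimal Python sketch. This is not the authors' implementation: all function names, the tie-breaking rule in the vote, and the use of "predicted steganography probability closest to 0.5" as a stand-in for classification hardness are assumptions made purely for illustration, and feature extraction and classifier training are omitted.

```python
import random
from collections import Counter


def mixed_training_set(cover, stego_by_method, proportion, seed=0):
    """Scheme 1 sketch: draw a fixed proportion of stego samples from every
    method's pool and combine them with all cover samples."""
    rng = random.Random(seed)
    data = [(x, 0) for x in cover]  # label 0 = cover (original carrier)
    for samples in stego_by_method.values():
        k = int(len(samples) * proportion)
        data += [(x, 1) for x in rng.sample(samples, k)]  # label 1 = stego
    return data


def majority_vote(predictions):
    """Scheme 2 sketch: fuse the 0/1 outputs of the per-method detectors.
    Ties are resolved as stego (1); this rule is an assumption."""
    counts = Counter(predictions)
    return 1 if counts[1] >= counts[0] else 0


def boundary_undersample(stego, stego_prob, k):
    """Scheme 3 sketch: keep the k stego samples whose predicted stego
    probability (from the previous iteration's ensemble) is closest to 0.5,
    i.e. nearest the decision boundary. A simplified hardness criterion."""
    ranked = sorted(zip(stego, stego_prob), key=lambda pair: abs(pair[1] - 0.5))
    return [x for x, _ in ranked[:k]]
```

In the self-paced scheme, a function like `boundary_undersample` would be called once per iteration, with `stego_prob` recomputed from the ensemble trained so far, and the selected samples would be paired with the cover samples to train the next base classifier.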

Full Text