Abstract
Perceptual speech hashing and robust watermarking have been widely investigated for speech integrity authentication. The former generates a watermark, and the latter embeds that watermark into the speech signal. In this paper, we propose a perceptual speech hash algorithm and a robust watermarking algorithm for speech integrity authentication. To obtain perceptual speech hash values, we propose a gammatone filter model of the speech signal that extracts sensitive auditory features (denoted gammatone features). A random Gaussian matrix reduces the dimensionality of the gammatone features to generate the perceptual speech hash values. For the watermarking algorithm, we construct learned dictionaries to obtain a robust sparse feature of the stationary wavelet transform coefficients, and embed the watermark (the perceptual speech hash values) into this sparse feature by patchwork and quantization index modulation. We demonstrate the imperceptibility of the authentication scheme in terms of signal-to-noise ratio, objective difference grade, and subjective difference grade, and verify its robustness against common signal processing operations. Moreover, the proposed method is sensitive to malicious modification of the watermarked speech. Compared with state-of-the-art algorithms, the proposed algorithm achieves better overall performance in detecting and localizing tampering with speech content.
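As a rough illustration of two building blocks named in the abstract, the sketch below shows a seeded Gaussian random projection that turns a feature vector into hash bits, and scalar quantization index modulation (QIM) for embedding one bit into one coefficient. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the sign-based binarization, the hash length, and the quantization step are all hypothetical choices, and the gammatone filtering, dictionary learning, and patchwork steps are not modeled.

```python
import random


def perceptual_hash(features, hash_bits=32, seed=0):
    """Project features with a seeded Gaussian matrix, then binarize by sign.

    Assumption: sign thresholding is used here for illustration only; the
    abstract does not say how the reduced features are turned into bits.
    """
    rng = random.Random(seed)  # the seed plays the role of a shared key
    bits = []
    for _ in range(hash_bits):
        # One row of the random Gaussian matrix per output bit.
        row = [rng.gauss(0.0, 1.0) for _ in features]
        projected = sum(r * f for r, f in zip(row, features))
        bits.append(1 if projected > 0 else 0)
    return bits


def qim_embed(value, bit, step=0.5):
    """Quantize value onto the lattice that encodes bit (0 or 1)."""
    offset = bit * step / 2.0  # the two lattices are shifted by step / 2
    return round((value - offset) / step) * step + offset


def qim_extract(value, step=0.5):
    """Return the bit whose lattice point lies closest to value."""
    d0 = abs(value - qim_embed(value, 0, step))
    d1 = abs(value - qim_embed(value, 1, step))
    return 0 if d0 <= d1 else 1


# Usage: hash a toy feature vector and embed its first bit in a coefficient.
toy_features = [0.1 * i for i in range(16)]
hash_bits = perceptual_hash(toy_features)
marked = qim_embed(1.234, hash_bits[0])
recovered = qim_extract(marked)
```

In this sketch a larger `step` tolerates larger perturbations of the marked coefficient before the extracted bit flips, at the cost of a larger embedding distortion, which mirrors the robustness versus imperceptibility trade-off the abstract describes.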
Highlights
With the rapid development of the Internet, digital multimedia materials can be transmitted efficiently.
The parameters were set as follows: the host speech signals were divided into 25 frames (N = 25), and each frame was further divided into eight segments (M = 8).
As a tool for speech integrity authentication, the proposed scheme is tolerant to common signal processing manipulations, such as MP3 compression, added noise, filtering, and re-sampling.
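The frame and segment parameters mentioned in the highlights (N = 25, M = 8) can be pictured with a short sketch. It assumes equal-length, non-overlapping frames and segments and drops any trailing samples that do not fit; the highlights do not state how the division is actually done, and the function name `split_frames` is hypothetical.

```python
def split_frames(signal, num_frames=25, num_segments=8):
    """Split a signal into num_frames frames of num_segments segments each.

    Assumption: frames and segments are contiguous and non-overlapping,
    and leftover samples at the end of the signal are discarded.
    """
    frame_len = len(signal) // num_frames
    seg_len = frame_len // num_segments
    frames = []
    for i in range(num_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        segments = [
            frame[j * seg_len:(j + 1) * seg_len]
            for j in range(num_segments)
        ]
        frames.append(segments)
    return frames


# Usage: a toy "signal" of 2007 samples gives 25 frames x 8 segments x 10 samples.
toy_signal = list(range(2007))
toy_frames = split_frames(toy_signal)
```

Indexing as `toy_frames[n][m]` then selects segment m of frame n, which is the granularity at which a per-frame scheme could report where tampering occurred.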
Summary
With the rapid development of the Internet, digital multimedia materials can be transmitted efficiently, but they can also be copied or modified without permission. As processing software and tools have become widely accessible, tampering with digital multimedia data has become increasingly easy. Adversaries can carry out various kinds of tampering operations on digital multimedia content (for example, inserting, deleting, and replacing) that can have significant social and economic repercussions. Speech signals are important multimedia signals, and include military commands, recorded evidence in courts, and online speech orders. Protecting the integrity of digital speech content is therefore important in the field of information security. The human ear can detect even a small change in auditory speech signals. In comparison with digital image integrity authentication, less attention has been paid to speech integrity authentication because it is more challenging.
The associate editor coordinating the review of this manuscript and approving it for publication was SK Hafizul Islam.