Simultaneous speaker identification and watermarking

Basant S Abd El-Wahab, Heba A El-Khobby, Fathi E Abd El-Samie, Mustafa M Abd Elnaby

doi:10.1007/s10772-019-09658-x

Abstract

Biometric template protection of speech signals and information hiding in speech signals are two challenging issues. To address them and increase the level of security, our objective is to build multi-level security systems based on speech signals. To this end, speech watermarking is used simultaneously with automatic speaker identification. Speech watermarking embeds images into the speech signals that are used for speaker identification. The watermark is extracted for authentication, and the effect of watermark removal on the performance of the speaker identification system in the presence of degradations is then studied. This paper presents an approach for speech watermarking based on empirical mode decomposition (EMD) in different transform domains and singular value decomposition (SVD). The speech signal is decomposed with EMD in different transform domains to yield zero-mean components called intrinsic mode functions (IMFs). The watermark is inserted into one of these IMFs with SVD. The proposed watermarking scheme is compared across different transform domains and different IMFs, using the log-likelihood ratio (LLR), correlation coefficient (Cr), signal-to-noise ratio (SNR), and spectral distortion (SD) as metrics. According to the simulation results, watermark embedding in the discrete sine transform domain provides higher SNR and Cr values and lower SD and LLR values. The proposed approach is robust to different attacks.
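
As a rough illustration of the SVD embedding step and of two of the metrics named in the abstract (SNR and Cr), the Python sketch below embeds a binary watermark into the singular values of a square block and recovers it. It rests on several assumptions that the abstract does not confirm: NumPy is used, the embedding follows the common additive singular-value scheme with a scaling factor `alpha`, a random matrix stands in for an IMF reshaped to 2-D, and the EMD and transform-domain steps are omitted entirely. All function names are hypothetical.

```python
import numpy as np


def embed(block, watermark, alpha=0.01):
    """Embed `watermark` into the singular values of a square `block`.

    Assumed scheme (not taken from the paper): add the scaled watermark
    to the diagonal singular-value matrix, then take a second SVD.
    """
    U, s, Vt = np.linalg.svd(block)
    D = np.diag(s) + alpha * watermark
    Uw, sw, Vwt = np.linalg.svd(D)
    # Rebuild the block with the modified singular values.
    marked = U @ np.diag(sw) @ Vt
    key = (Uw, Vwt, s)  # side information needed at the extractor
    return marked, key


def extract(marked, key, alpha=0.01):
    """Recover the watermark from a (possibly degraded) marked block."""
    Uw, Vwt, s = key
    s_star = np.linalg.svd(marked, compute_uv=False)
    D_star = Uw @ np.diag(s_star) @ Vwt
    return (D_star - np.diag(s)) / alpha


def snr_db(original, marked):
    """Signal-to-noise ratio in dB, treating the embedding change as noise."""
    noise = original - marked
    return 10 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))


def correlation(a, b):
    """Correlation coefficient (Cr) between two arrays of equal size."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]


# Stand-in data: a random block instead of a real IMF, a random binary image.
rng = np.random.default_rng(0)
block = rng.standard_normal((64, 64))
watermark = (rng.random((64, 64)) > 0.5).astype(float)

marked, key = embed(block, watermark)
recovered = extract(marked, key)

print("SNR (dB):", snr_db(block, marked))
print("Cr:", correlation(recovered, watermark))
```

In this sketch a smaller `alpha` raises the SNR of the marked block but leaves the watermark more exposed to noise at extraction, which is the usual trade-off such a scaling factor controls. LLR and SD are not covered here.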
