Digital watermarking based on process of speech production

Toshiyuki Sakai, Naohisa Komatsu, Edward J. Delp III, Ping W. Wong

doi:10.1117/12.526284

Abstract

The speech production process can be divided into three parts: the glottal source, articulation, and radiation. We propose a watermarking method for speech that manipulates the articulation stage of this process. We apply our method to CS-ACELP (the G.729 standard), an ITU-T recommendation that specifies a low-bit-rate 8 kb/s speech coding algorithm with wireline quality. The watermarked vocal tract model is expressed by codebooks built from LSP (Line Spectrum Pair) parameters. The codebook vectors replace some of the extracted LSPs, and speech is synthesized from the replaced LSPs. We generate two codebooks with our own method for modifying the LSPs of the spectrum envelope: one watermarked codebook is created by narrowing the width of the LSPs, and the other by stretching the LSPs on both sides of each formant. Each voice frame in the CS-ACELP decoder has ten LSP dimensions. In the detection process, the weighted Euclidean distance (WED) between the watermarked codebooks and the extracted LSPs is calculated, and whether a watermark is embedded is judged from the calculated WED. Evaluation tests of detection accuracy are discussed together with simulation results.
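
The detection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the weights, the decision threshold, or how the codebooks are searched, so `weights`, `threshold`, the nearest-codeword search, and all function names here are assumptions made for illustration. The sketch also assumes the square-root form of the WED, i.e. the square root of the weighted sum of squared differences over the ten LSP dimensions.

```python
import math
from typing import List, Sequence, Tuple


def weighted_euclidean_distance(
    x: Sequence[float], y: Sequence[float], w: Sequence[float]
) -> float:
    """Square root of the weighted sum of squared differences (assumed form)."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))


def nearest_codeword(
    lsp: Sequence[float],
    codebook: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[int, float]:
    """Return (index, distance) of the codebook vector closest to `lsp`."""
    distances: List[float] = [
        weighted_euclidean_distance(lsp, codeword, weights) for codeword in codebook
    ]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]


def is_watermarked(
    lsp: Sequence[float],
    codebook: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float,
) -> bool:
    """Hypothetical decision rule: a frame counts as watermarked when its
    extracted LSP vector lies within `threshold` of some codebook vector."""
    _, distance = nearest_codeword(lsp, codebook, weights)
    return distance <= threshold


# Toy data only: two made-up 10-dimensional codewords, uniform weights.
toy_codebook = [
    [0.1 * i for i in range(1, 11)],
    [0.1 * i + 0.05 for i in range(1, 11)],
]
uniform_weights = [1.0] * 10

embedded_frame = list(toy_codebook[0])             # LSP replaced by a codeword
plain_frame = [0.1 * i + 0.2 for i in range(1, 11)]  # far from both codewords

print(is_watermarked(embedded_frame, toy_codebook, uniform_weights, 0.01))
print(is_watermarked(plain_frame, toy_codebook, uniform_weights, 0.01))
```

In this toy setup the frame whose LSP vector was replaced by a codeword has distance zero to the codebook, while the unmodified frame does not fall within the threshold. In practice the threshold would have to be tuned against the distortion introduced by coding and transmission, which the abstract leaves to the evaluation tests.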
