Abstract

Several approaches to forensic speaker comparison use formant center-frequency measurements as features, because formants have a fairly straightforward interpretation as resonance frequencies of the cavities of the human vocal tract. Formant tracking algorithms, most of them based on linear predictive coding (LPC), are commonly used to extract these measurements automatically. Telephone conversations make up a substantial share of forensic material, and they increasingly pass through wireless communication channels rather than landline transmission. The effects and limitations introduced by the adaptive multirate (AMR) family of codecs, which is used for speech transmission in GSM and UMTS networks, are therefore of particular interest in forensic settings. To isolate the effects caused by the codecs alone, speech recordings were encoded and decoded at each of the bitrate levels provided by the AMR narrowband codec. The formant frequencies of vowel segments were then extracted using different trackers and settings. The preliminary results suggest partial shifts in formant frequency that depend on the codec level and on the individual speaker, but no consistent trend emerges.
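The abstract does not say which trackers or settings were used, so the following is only a minimal sketch of the general LPC idea behind such trackers, not the study's method. It assumes the autocorrelation method and NumPy; the function names, the LPC order, and the frequency and bandwidth thresholds are illustrative choices. Practical trackers usually add pre-emphasis, windowing, and frame-to-frame continuity constraints, all of which are omitted here.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Estimate A(z) = 1 + a1*z^-1 + ... + ap*z^-p with the
    autocorrelation method (solves the Yule-Walker equations)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz system: sum_k a_k * r[|m - k|] = -r[m], for m = 1..order.
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    tail = np.linalg.solve(r[lags], -r[1 : order + 1])
    return np.concatenate(([1.0], tail))

def formants_from_lpc(a, fs, min_freq=90.0, max_bandwidth=400.0):
    """Convert the poles of 1/A(z) into candidate formant frequencies (Hz).

    min_freq and max_bandwidth are hypothetical plausibility thresholds.
    """
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]  # keep one pole per conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi
    keep = (freqs > min_freq) & (bandwidths < max_bandwidth)
    return np.sort(freqs[keep])
```

For narrowband telephone speech sampled at 8 kHz, a call such as `formants_from_lpc(lpc_coefficients(frame, 10), 8000.0)` on a vowel frame would return candidate formant frequencies; the order of 10 is a common rule of thumb rather than a value taken from the study. Comparing such estimates before and after an AMR encode/decode cycle is one way to look at the kind of shift the abstract describes.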
