Abstract
Identifying speakers by voice analysis is a serious problem. Many procedures have been proposed, some based on signal processing techniques common to automatic speech recognition. Yet humans can very often make highly accurate identifications, even under the challenging listening conditions common in forensic audio. Procedures that mimic human perception have been developed for this purpose, including a semi-automatic forensic speaker recognition system using four sets of parameters, or vectors, based on a substantial number of related speech parameters. The data were drawn from identifications of 28 male speakers in a field of 10 foil voices; the approach was replicated in full three times. Identification scores for three of these vectors (voice quality, vowel formants, fundamental frequency) were very high, while those for the temporal vector were positive but modest. Moreover, every one of the vector-summation scores identified the target speaker. These results were based on high-quality simulated field recordings and demonstrate the efficacy of modeling biological systems (human perception) to solve challenging processing problems.