Abstract

In Brazil, forensic speaker recognition still relies on a subjective decision-making process based on analysing the results of untrustworthy techniques. Owing to the lack of a voice database, speaker verification is currently applied to samples collected specifically for confrontation. However, comparative speaker analysis of a contested discourse requires collecting an excessive number of voice samples from a series of individuals. Further, the recognition system must indicate which of the pre-selected individuals is most compatible with the contested voice. Accordingly, this paper proposes combining linear predictive coding (LPC) and ordinary least squares (OLS) as a speaker verification tool for forensic analysis. The proposed technique establishes confidence and similarity measures on which to base forensic reports, indicating verification of the speaker of the contested discourse. The paper thus contributes an accurate, quick, alternative method to help verify the speaker. Across seven different tests, the study preliminarily achieved a hit rate of 100% on a limited dataset (Brazilian Portuguese). Furthermore, the developed method extracts a larger number of formants, which are indispensable for statistical comparisons via OLS. The proposed framework is robust to certain levels of noise, to sentences with suppressed or changed words, and to differences in audio quality or even substantial differences in audio duration.

Highlights

  • Voice recognition is a process to identify the interlocutor and/or discourse performed

  • Voice recognition can be applied in two ways: discourse recognition, and speaker recognition [2]

  • This study developed a novel, previously unpublished method for speaker verification that combines formants, linear predictive coding (LPC), and ordinary least squares (OLS) to generate results for decision-making in a forensic context

  • The robustness of the developed model is demonstrated by positive results even in atypical situations such as noise, uneven speech time, varying quality, and textual independence

  • All scenarios preliminarily tested indicated a 100% success rate on a limited dataset (Brazilian Portuguese), reducing the possibility of false positives


Summary

Introduction

Voice recognition is a process to identify the interlocutor and/or the discourse performed. This study submits the discourses to formant and pitch extraction using linear predictive coding (LPC). After extracting these speech characteristics, the technique analyses the contested audio and confronts it with pre-selected patterns using ordinary least squares (OLS) [19]. The study developed a novel, previously unpublished method for speaker verification that combines formants, LPC, and OLS to generate results for decision-making in a forensic context. The robustness of the developed model is demonstrated by positive results even in atypical situations such as noise, uneven speech time, varying quality, and textual independence. All scenarios preliminarily tested indicated a 100% success rate on a limited dataset (Brazilian Portuguese), reducing the possibility of false positives. The results are presented, followed by a comparison with other approaches, and the paper concludes with remarks on the developed approach.
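The two steps named above, formant extraction with LPC and comparison with OLS, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the autocorrelation variant of LPC, the model order, the synthetic test frames, and the use of the coefficient of determination as a similarity score are all assumptions made for illustration.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC (an assumed variant); returns [1, a1, ..., ap]."""
    windowed = frame * np.hamming(len(frame))
    full = np.correlate(windowed, windowed, mode="full")
    r = full[len(windowed) - 1 : len(windowed) + order]  # lags 0..order
    # Toeplitz normal equations: sum_k a_k r[|i - k|] = -r[i], i = 1..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def formant_frequencies(frame, sample_rate, order):
    """Estimate resonance frequencies (Hz) from the angles of the LPC poles."""
    roots = np.roots(lpc_coefficients(frame, order))
    roots = roots[np.imag(roots) > 0]  # keep one pole of each conjugate pair
    return np.sort(np.angle(roots) * sample_rate / (2 * np.pi))


def ols_similarity(reference, contested):
    """Fit contested = b0 + b1 * reference by OLS; return (b0, b1, R^2)."""
    reference = np.asarray(reference, dtype=float)
    contested = np.asarray(contested, dtype=float)
    X = np.column_stack([np.ones_like(reference), reference])
    beta, *_ = np.linalg.lstsq(X, contested, rcond=None)
    residuals = contested - X @ beta
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((contested - contested.mean()) ** 2)
    return beta[0], beta[1], 1.0 - ss_res / ss_tot


# Synthetic frames standing in for real speech (hypothetical data).
sample_rate = 8000
t = np.arange(1024) / sample_rate
reference_frame = sum(
    amp * np.sin(2 * np.pi * f * t)
    for amp, f in [(1.0, 500), (0.5, 1500), (0.25, 2500)]
)
contested_frame = sum(
    amp * np.sin(2 * np.pi * f * t)
    for amp, f in [(1.0, 520), (0.5, 1480), (0.25, 2550)]
)

ref_formants = formant_frequencies(reference_frame, sample_rate, order=6)
con_formants = formant_frequencies(contested_frame, sample_rate, order=6)
intercept, slope, r2 = ols_similarity(ref_formants, con_formants)
print("reference formants (Hz):", np.round(ref_formants, 1))
print("contested formants (Hz):", np.round(con_formants, 1))
print("slope:", round(slope, 3), "R^2:", round(r2, 4))
```

In this sketch, a slope near 1 and an R² near 1 suggest that the two sets of resonances line up closely. How the paper turns such regression statistics into a forensic decision is not stated in this summary, so any threshold would have to come from the full text.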

Fourier Transform and Windowing
Linear Predictive Coding
Ordinary Least Squares Method
Statistical Comparison Criteria
Method
Phase 1
Phase 2
Phase 3
Experimental Results
Formants
Validating the Model
Results
Comparison with Other State-Of-The-Art Solutions
Present Method
Final Remarks