Abstract
Humans have a stronger auditory ability than machines to discriminate and identify speakers in noisy environments, and traditional forensic automatic speaker recognizers perform poorly on noisy recordings. This paper proposes a GMM-UBM Forensic Automatic Speaker Recognition (FASR) system designed to reduce the effect of noise on performance. The system uses Gammatone Frequency Cepstral Coefficients (GFCC), which are based on a model of the auditory periphery, and incorporates Principal Component Analysis (PCA). The system was tested and validated on Mandarin voice databases corrupted with different levels of white noise and office noise. Its performance was compared with that of a baseline system using Mel Frequency Cepstral Coefficients (MFCC), also combined with PCA, under the same conditions. The results show that the combined GFCC system substantially outperforms the baseline MFCC system when the level of office noise is high.
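The abstract does not say how the noisy test recordings were produced. As an illustration only, the sketch below shows one common way to corrupt a clean signal with white noise at a chosen signal-to-noise ratio (SNR). The function names, the Gaussian noise model, the 10 dB target, and the sine tone standing in for speech are all assumptions made for this example, not details taken from the paper.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def add_white_noise(clean, snr_db, seed=0):
    """Return `clean` plus Gaussian white noise scaled to the target SNR (dB).

    Hypothetical helper: the paper's actual mixing procedure is not given.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    # Pick the gain so that 10*log10(P_clean / P_scaled_noise) equals snr_db.
    gain = math.sqrt(power(clean) / (power(noise) * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from the clean signal and its noisy version."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


# Stand-in for a speech recording (assumption): a 200 Hz tone sampled at 8 kHz.
clean = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
noisy = add_white_noise(clean, snr_db=10)
```

Calling `measured_snr_db(clean, noisy)` recovers the requested SNR, which is a quick check that the scaling is right. Office noise could be mixed the same way by replacing the generated noise with samples from a recorded noise file.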