Abstract

Economical speaker recognition from a degraded human voice signal remains a challenge. This article presents the results of an experiment that aims to improve the feature extraction method for effective speaker identification from a degraded audio signal with the help of data science. Every speaker’s voice has distinctive characteristics. Human ears can easily perceive these characteristics and identify a speaker from the speaker’s audio; the Mel-Frequency Cepstral Coefficient (MFCC) helps give a machine the same capability. MFCC is extensively used for human voice feature extraction. In our experiment we combined MFCC and Linear Predictive Coding (LPC) for better speaker recognition accuracy. MFCC first divides the signal into frames and then computes cepstral coefficients for each frame, converting the audio signal into numerical feature values that an Artificial Intelligence (AI) based speaker recognition system can use to recognize the speaker efficiently. This article covers how audio features can be extracted effectively from a degraded voice signal. In our experiment we observed an improved Equal Error Rate (EER) and True Match Rate (TMR) due to a high sampling rate and a low frequency range for the mel-scale triangular filters. The article also covers the effect of pre-emphasis on speaker recognition when the audio signal contains high background noise.
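The pre-emphasis and mel-scale filter-bank steps mentioned above can be sketched in a few lines. The following is a minimal illustration, not the article's implementation: the 0.97 pre-emphasis coefficient, the 26-filter default, and the function names are common textbook choices assumed here, and the mel mapping uses the standard 2595·log10(1 + f/700) formula.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1].
    Boosts high frequencies before framing; alpha = 0.97 is a
    conventional choice, not a value taken from the article."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def hz_to_mel(f):
    """Map a frequency in Hz onto the perceptual mel scale."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_edges(n_filters=26, f_low=0.0, f_high=8000.0):
    """Edge/center frequencies (Hz) of n_filters triangular filters
    spaced uniformly on the mel scale between f_low and f_high.
    Lowering f_high narrows the analysed band, which is the
    'low frequency range' knob the abstract refers to."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + i * step) for i in range(n_filters + 2)]
```

Because the filter edges are equally spaced in mel rather than in Hz, the filters are dense at low frequencies and sparse at high ones, mirroring human pitch perception.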
