Abstract

This paper proposes a method to improve speaker recognition at varying distances by adjusting the reference voice according to voice features. Speaker recognition identifies an individual from their voice by analyzing acoustic features of the voice and comparing them with a reference voice stored in a database. Conventional speaker recognition techniques lose accuracy when the speaker's distance from the microphone varies. In this work, we found that high-frequency components decay faster than low-frequency components as speaker distance increases. Based on this observation, we propose a method that uses a support vector machine (SVM) to classify speaker distance from sound features such as the amplitude sum of high-frequency components and the dynamic range. Once the speaker distance is determined, the reference signal in the database is adjusted for that distance before it is used for speaker recognition. Mel-frequency cepstral coefficients (MFCC) and dynamic time warping (DTW) were used as the recognition algorithm. Experiments were conducted with speakers placed at three distances from the microphone: 0.1, 1, and 2.5 meters. The experimental results show that signal components at 4 kHz and above decline in amplitude faster than lower-frequency components as distance increases. The recognition results also demonstrate a significant improvement in accuracy with the proposed method.
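The distance features and the matching step named in the abstract could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the 4 kHz default cutoff is taken from the reported observation, dynamic range is assumed to mean peak-to-peak amplitude, and the DTW local cost is assumed to be Euclidean. MFCC extraction, SVM training, and the reference adjustment itself are omitted.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short example frames)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc))
    return mags


def high_freq_amplitude_sum(signal, sample_rate, cutoff_hz=4000.0):
    """Sum of spectral magnitudes at or above cutoff_hz (assumed feature definition)."""
    n = len(signal)
    mags = magnitude_spectrum(signal)
    # Bin k corresponds to frequency k * sample_rate / n.
    return sum(m for k, m in enumerate(mags) if k * sample_rate / n >= cutoff_hz)


def dynamic_range(signal):
    """Peak-to-peak amplitude (assumed definition; the abstract does not specify one)."""
    return max(signal) - min(signal)


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    Each element of a and b is a vector (for example, one frame of MFCCs);
    the local cost is the Euclidean distance between frames (an assumption).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

In a pipeline of the kind the abstract describes, the two feature values would form the input vector of the distance classifier, and `dtw_distance` would compare the MFCC frames of the test utterance with those of the distance-adjusted reference.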

