Abstract

The widespread use of automatic speaker recognition technology in real-world applications demands robustness against a variety of realistic conditions. In this paper, a robust spectral feature set called NDSF (Normalized Dynamic Spectral Features) is proposed for automatic speaker recognition under mismatched conditions. Magnitude spectral subtraction is performed on the spectral features to compensate for additive noise. A spectral-domain modification is then applied using a time-difference approach followed by a Gaussianization non-linearity. Histogram normalization is applied to these dynamic spectral features to compensate for channel mismatch and for some of the non-linear effects introduced by handset transducers. Feature extraction with the proposed features is carried out for a text-independent automatic speaker recognition (identification) system. The performance of the proposed feature set is compared with that of conventional cepstral features (mel-frequency cepstral coefficients and linear prediction cepstral coefficients) under acoustic mismatch caused by the use of different sensors. Studies are performed on two databases: the Multi-Variability Speaker Recognition (MVSR) database developed by IIT Guwahati, and the Multi-speaker Continuous (Hindi) Speech database from the Department of Information Technology, Government of India. Experimental analysis shows that spectral-domain dynamic features enhance robustness by reducing additive noise and the channel effects caused by sensor mismatch. The proposed NDSF features are found to be more robust than cepstral features on both datasets.
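
The processing chain described above (spectral subtraction, time differencing, then normalization) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the noise estimate taken from the first few frames, the spectral floor, and the use of a single rank-based mapping onto a standard normal as a stand-in for both the Gaussianization non-linearity and the histogram normalization are all assumptions made for illustration. The input is assumed to be a magnitude spectrogram given as a list of frames, each a list of per-bin magnitudes.

```python
from statistics import NormalDist


def spectral_subtract(mag, noise_frames=5, floor=0.01):
    """Magnitude spectral subtraction (illustrative).

    Assumes the first `noise_frames` frames contain no speech and uses
    their per-bin mean as the additive-noise estimate. Results are
    floored at a small fraction of the original magnitude so they stay
    non-negative.
    """
    n_bins = len(mag[0])
    noise = [
        sum(frame[k] for frame in mag[:noise_frames]) / noise_frames
        for k in range(n_bins)
    ]
    return [
        [max(frame[k] - noise[k], floor * frame[k]) for k in range(n_bins)]
        for frame in mag
    ]


def time_difference(feats):
    """First-order difference between consecutive frames (dynamic features)."""
    return [
        [cur_v - prev_v for prev_v, cur_v in zip(prev, cur)]
        for prev, cur in zip(feats, feats[1:])
    ]


def gaussianize(feats):
    """Rank-based normalization of each bin's trajectory.

    Assumption: each frequency bin is normalized independently across
    time by mapping its empirical ranks onto standard-normal quantiles.
    """
    n_frames, n_bins = len(feats), len(feats[0])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    out = [[0.0] * n_bins for _ in range(n_frames)]
    for k in range(n_bins):
        order = sorted(range(n_frames), key=lambda t: feats[t][k])
        for rank, t in enumerate(order):
            # (rank + 0.5) / n_frames lies strictly inside (0, 1)
            out[t][k] = std_normal.inv_cdf((rank + 0.5) / n_frames)
    return out


def ndsf_sketch(mag):
    """Chain the three illustrative steps in the order the abstract lists them."""
    return gaussianize(time_difference(spectral_subtract(mag)))
```

Because the normalization depends only on ranks within each bin, any monotonic distortion of a bin's values over time (one simple model of a transducer non-linearity) leaves the output unchanged, which is the intuition behind using it for channel compensation.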
