Combining spectral features of standard and Throat Microphones for speaker identification

Nafeesa Mubeen, A Nayeemulla Khan, G Vinoth, A Shahina

doi:10.1109/icrtit.2012.6206769

Abstract

The objective of this paper is to improve the performance of a speaker recognition system by combining the speaker-specific evidence present in the spectral characteristics of standard microphone speech and throat microphone speech. Certain vocal tract spectral features extracted from these two speech signals are distinct and could be complementary to one another. These features could also be speech-specific as well as speaker-specific. The distinguishing and complementary nature of the spectral features is due to the difference in the placement of the two microphones. Autoassociative neural networks are used to model the speaker characteristics based on the system features represented by weighted linear prediction cepstral coefficients. The speaker recognition system based on Throat Microphone (TM) spectral features is comparable to, though slightly less accurate than, the system based on standard (or Normal) Microphone (NM) features. By combining the evidence from the NM-based and TM-based systems using late integration, performance improves from about 91% (obtained using NM features alone) to 94% (NM and TM combined). This shows the potential of combining various other speaker-specific characteristics of the NM and TM speech signals for further improvement in performance.
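The late integration mentioned in the abstract can be illustrated with a minimal sketch of score-level fusion. The abstract does not specify the combination rule, so everything here is an assumption made for illustration: the function names, the min-max normalization, the weighted-sum rule, and the default weight are hypothetical and are not taken from the paper. Each system is assumed to output one confidence score per enrolled speaker, with higher meaning a better match.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] so the two systems are comparable (assumed step)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal, so this system carries no discriminating evidence.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(nm_scores, tm_scores, nm_weight=0.5):
    """Weighted sum of normalized per-speaker scores (hypothetical fusion rule)."""
    if len(nm_scores) != len(tm_scores):
        raise ValueError("both systems must score the same set of speakers")
    nm = min_max_normalize(nm_scores)
    tm = min_max_normalize(tm_scores)
    return [nm_weight * a + (1.0 - nm_weight) * b for a, b in zip(nm, tm)]


def identify(nm_scores, tm_scores, nm_weight=0.5):
    """Return the index of the speaker with the highest fused score."""
    fused = fuse_scores(nm_scores, tm_scores, nm_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up scores for three enrolled speakers (not data from the paper).
nm_example = [0.2, 0.9, 0.8]   # NM system alone prefers speaker 1
tm_example = [0.1, 0.3, 0.9]   # TM system prefers speaker 2
decision = identify(nm_example, tm_example)
```

In this toy case the NM system alone would pick speaker 1, while the fused decision follows the stronger TM evidence for speaker 2. In practice the weight would be tuned on held-out data.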
