Abstract
Human speech recognition drastically outperforms commercial automatic speech recognition (ASR) systems, especially in noisy environments. Knowledge of auditory system dynamics is currently limited; however, it is known that information is coded and processed via action potentials. This research aims to better understand the coding mechanisms along the auditory pathway while devising a noise-robust system for speech recognition. A biologically plausible algorithm for vowel classification is proposed that uses only spikes for both the feature extraction and the classification stages. The algorithm uses an improved, adaptive model of the inner hair cell [Sumner et al., J. Acoust. Soc. Am. 113, 893–901 (2003)] to generate spike trains at different characteristic frequencies. The synchrony among the hair cells serves as a noise-robust means of feature extraction. Detected features are then classified using a spike-based rank order coder, which encodes information in the arrival times of spikes at the postsynaptic neuron [Delorme and Thorpe, Neural Networks 14, 795–803 (2001)]. Experiments on a noisy vowel dataset (5 dB SNR) show an average 15% increase in recognition rate for the prototype system compared with a nearest-neighbor classifier employing Mel frequency cepstral coefficients.
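To make the classification stage concrete, the following is a minimal sketch of rank order coding in general, not the implementation evaluated here. It assumes the common formulation in which each input's contribution is attenuated according to its firing rank, so that earlier spikes count more. The function names, the modulation factor `mod = 0.9`, the one-example template rule, and the toy spike times are all hypothetical choices made for illustration.

```python
import numpy as np

def spike_ranks(spike_times):
    """Return the firing rank of each input channel (0 = earliest spike)."""
    order = np.argsort(spike_times, kind="stable")
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(len(order))
    return ranks

def rank_order_score(spike_times, weights, mod=0.9):
    """Activation of one postsynaptic neuron: each weight is scaled by
    mod**rank, so inputs that fire earlier contribute more.
    (mod = 0.9 is an assumed value, not taken from the abstract.)"""
    return float(np.sum(np.asarray(weights) * mod ** spike_ranks(spike_times)))

def template_from_example(spike_times, mod=0.9):
    """Hypothetical one-shot training rule: set each weight to mod**rank
    of a single example, so the neuron responds most strongly when the
    same firing order recurs."""
    return mod ** spike_ranks(spike_times)

def classify(spike_times, templates, mod=0.9):
    """Pick the label whose neuron has the highest activation."""
    scores = {label: rank_order_score(spike_times, w, mod)
              for label, w in templates.items()}
    return max(scores, key=scores.get)

# Toy example: three frequency channels, two vowel classes (made-up times, in ms).
templates = {
    "a": template_from_example([1.0, 2.0, 3.0]),  # low channel fires first
    "i": template_from_example([3.0, 2.0, 1.0]),  # high channel fires first
}
print(classify([1.1, 2.5, 2.9], templates))
```

Because only the order of arrival matters in this sketch, jitter in the exact spike times leaves the decision unchanged as long as the firing order is preserved.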