Abstract
The problem of real-time automatic speech recognition in an adverse environment is addressed. Though much research has been performed in the area of speech recognition, only limited success has been demonstrated for real-time recognition in noisy, stressful environments. The primary reason is that the performance of present-day recognition algorithms is predicated on assumptions about the environmental settings in which the algorithms were formulated and implemented. In this paper, we discuss the effects of additive background noise on speech quality and recognition parameters, and propose a source-generator-based framework to address stress and noise. Using this framework, a computationally efficient real-time recognition system called ICARUS is developed. The speech recognition system incorporates direct processing steps to address the effects of additive noise on the speech signal and of stress on the speech production system. Central issues addressed include (i) improved characterization of speech spoken in noisy situations, involving both parameter estimation methods and analysis of how speech characteristics vary in adverse environments (i.e., stress and Lombard effect), (ii) exploration of signal processing strategies tailored to such speech, and (iii) demonstration of real-time system performance of the proposed methods. The proposed recognition system was formulated using a digital signal processing platform. Performance evaluations showed an improvement in speech feature representation under stressed speaking conditions, with an average improvement in recognition rate of +17.28% across eleven noisy, stressful speaking conditions.