Abstract

Research interest in audio-visual speech recognition (AVSR) systems is driven by the growing number of multimedia applications that require robust speech recognition. The use of visual features in AVSR is justified by the bimodal (audio and visual) nature of speech production and by the need for features that are invariant to acoustic noise. The performance of an AVSR system relies on a robust set of visual features obtained from accurate detection and tracking of the mouth region, so mouth tracking plays a major role in such systems. This paper presents an improved mouth tracking technique based on a radial basis function neural network (RBF NN) and its application to AVSR systems. A modified extended Kalman filter (EKF) is used to adjust the parameters of the RBF NN. Simulation results indicate good performance of the proposed method.
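To make the idea of EKF-based training concrete, here is a minimal sketch of a textbook EKF update applied to the output weights and centers of a Gaussian RBF network. It is not the modified EKF proposed in the paper, whose details the abstract does not give. The function names, the fixed width `sigma`, the noise variance `R`, and the one-dimensional sine-fitting task are all illustrative assumptions; the paper's mouth-tracking inputs and outputs are not specified here.

```python
import numpy as np

def rbf_forward(x, weights, centers, sigma):
    """Scalar output of a Gaussian RBF network, plus intermediates."""
    diff = x - centers                                   # (m, d)
    phi = np.exp(-np.sum(diff**2, axis=1) / (2.0 * sigma**2))  # (m,)
    return float(weights @ phi), phi, diff

def ekf_step(x, target, weights, centers, P, sigma, R=0.1):
    """One standard EKF update of theta = [weights, centers] (assumed setup)."""
    m, d = centers.shape
    y_hat, phi, diff = rbf_forward(x, weights, centers, sigma)
    # Jacobian of the output with respect to weights and centers.
    d_centers = (weights * phi)[:, None] * diff / sigma**2
    H = np.concatenate([phi, d_centers.ravel()])         # (n,)
    PH = P @ H
    S = H @ PH + R                                       # innovation variance
    K = PH / S                                           # Kalman gain
    theta = np.concatenate([weights, centers.ravel()]) + K * (target - y_hat)
    P = P - np.outer(K, PH)                              # covariance update (P symmetric)
    return theta[:m], theta[m:].reshape(m, d), P

def mse(xs, ys, weights, centers, sigma):
    preds = np.array([rbf_forward(x, weights, centers, sigma)[0] for x in xs])
    return float(np.mean((preds - ys) ** 2))

# Toy regression problem (hypothetical stand-in for the tracking task).
xs = np.linspace(-3.0, 3.0, 60).reshape(-1, 1)
ys = np.sin(xs[:, 0])

m, sigma = 8, 1.0
centers = np.linspace(-3.0, 3.0, m).reshape(-1, 1)
weights = np.zeros(m)
P = np.eye(m + m * centers.shape[1])                     # initial parameter covariance

mse_before = mse(xs, ys, weights, centers, sigma)
rng = np.random.default_rng(0)
for epoch in range(5):
    for i in rng.permutation(len(xs)):
        weights, centers, P = ekf_step(xs[i], ys[i], weights, centers, P, sigma)
mse_after = mse(xs, ys, weights, centers, sigma)
print(mse_before, mse_after)
```

In this sketch the EKF treats the network parameters as the state to be estimated and each training sample as a noisy measurement, so the gain `K` acts as a per-parameter adaptive learning rate.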
