Abstract

This paper introduces a new methodology for creating a multimodal corpus for audio-visual speech recognition in driver monitoring systems. Multimodal speech recognition makes it possible to rely on audio data when video data are useless (e.g., at nighttime), and to rely on video data in acoustically noisy conditions (e.g., on highways). The article discusses several basic scenarios in which speech recognition in the vehicle environment is required for interaction with the driver monitoring system. The methodology defines the main stages of, and requirements for, building a multimodal corpus. The paper also describes the metaparameters that the multimodal corpus must satisfy. In addition, a software package for recording an audio-visual speech corpus is described.
