Abstract

Recent success in the field of speech technology is undeniable. Developers at Microsoft and IBM have reported automatic speech recognition systems that transcribe conversational telephone speech at human-level accuracy; by various estimates, their word error rate (WER) is now about 5.1–5.8%. However, the most challenging problems in speech recognition, diarization and noise cancellation, remain open. A comparative analysis of the most frequent errors made by systems and by people on recognition tasks shows that the errors are broadly similar. Yet human errors are far less critical: they seldom distort the meaning of an utterance. In other words, human errors are not semantic. That is why the mechanisms of human speech perception are such a promising area of research. This paper proposes a model of the general structure of active auditory perception and outlines a neurobiological basis for the hypothesis put forward. The proposed concept serves as a basic platform for a general multiagent architecture. We assume that speech recognition is guided by attention even in its early stages, with context and experience changing the early auditory code. The model simulates the involuntary attention that children use in mastering their native language, based on an emotional assessment of perceptually significant auditory information. The multiagent internal dynamics of auditory speech coding may also provide new insight into how hearing impairment can be treated. The formal description of the structure of speech perception can serve as a general theoretical basis for developing universal automatic speech recognition systems that remain highly effective in noisy conditions and cocktail-party situations. The formal means for a program implementation of the present model are multiagent systems.
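The multiagent, attention-modulated coding described above can be illustrated with a minimal sketch. All names (`PerceptAgent`, `salience`, `encode_frame`) and the salience heuristic are hypothetical stand-ins, not the paper's actual model: each agent weights one auditory feature, and an involuntary-attention signal, derived from a crude emotional-salience estimate of the frame, modulates the early code.

```python
# Hypothetical sketch of attention-modulated multiagent auditory coding.
# Agent names, weights, and the salience heuristic are illustrative only.

from dataclasses import dataclass
from typing import List

@dataclass
class PerceptAgent:
    """One agent responsible for a single auditory feature."""
    name: str
    weight: float  # learned perceptual significance of this feature

    def activation(self, feature: float, attention: float) -> float:
        # Early auditory code, reshaped top-down by the attention signal.
        return self.weight * feature * attention

def salience(features: List[float]) -> float:
    # Crude stand-in for an emotional/novelty assessment: frames with a
    # larger spread attract more involuntary attention (clamped to 1.0).
    mean = sum(features) / len(features)
    return min(1.0, 0.5 + abs(max(features) - mean))

def encode_frame(agents: List[PerceptAgent], features: List[float]) -> List[float]:
    # One shared attention value modulates every agent's early code.
    att = salience(features)
    return [a.activation(f, att) for a, f in zip(agents, features)]

agents = [
    PerceptAgent("pitch", 0.9),
    PerceptAgent("energy", 0.6),
    PerceptAgent("timbre", 0.3),
]
code = encode_frame(agents, [0.8, 0.5, 0.2])
```

In this toy version the attention signal is a single scalar per frame; the model in the paper instead envisions context- and experience-dependent changes to the code itself, which would require stateful agents.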
