The Representation of Speech and Its Processing in the Human Brain and Deep Neural Networks

Odette Scharenborg

doi:10.1007/978-3-030-26061-3_1

Abstract

For most languages in the world and for speech that deviates from the standard pronunciation, not enough (annotated) speech data is available to train an automatic speech recognition (ASR) system. Moreover, human intervention is needed to adapt an ASR system to a new language or type of speech. Human listeners, on the other hand, are able to quickly adapt to nonstandard speech and can learn the sound categories of a new language without having been explicitly taught to do so. In this paper, I present comparisons between human speech processing and deep neural network (DNN)-based ASR and argue that the cross-fertilisation of the two research fields can provide valuable information for the development of ASR systems that can flexibly adapt to any type of speech in any language. Specifically, I present results of several experiments carried out on both human listeners and DNN-based ASR systems on the representation of speech and on lexically guided perceptual learning, i.e., the ability to adapt a sound category on the basis of new incoming information, resulting in improved processing of subsequent speech. The results show that DNNs appear to learn structures that humans use to process speech without being explicitly trained to do so, and that, similar to humans, DNN systems learn speaker-adapted phone category boundaries from a few labelled examples. These results are the first steps towards building ASR systems inspired by human speech processing that, similar to human listeners, can adjust flexibly and quickly to all kinds of new speech.
