Abstract

Natural and artificial audition can in principle acquire different solutions to a given problem. The constraints of the task, however, can nudge the cognitive science and engineering of audition to qualitatively converge, suggesting that a closer mutual examination could enrich both artificial hearing systems and process models of the mind and brain. Speech recognition, an area ripe for such exploration, is inherently robust in humans to a number of transformations at various spectrotemporal granularities. To what extent are these robustness profiles accounted for by high-performing neural network systems? We bring together experiments in speech recognition under a single synthesis framework to evaluate state-of-the-art neural networks as stimulus-computable, optimized observers. In a series of experiments, we (1) clarify how influential speech manipulations in the literature relate to each other and to natural speech, (2) show the granularities at which machines exhibit out-of-distribution robustness, reproducing classical perceptual phenomena in humans, (3) identify the specific conditions where model predictions of human performance differ, and (4) demonstrate a crucial failure of all artificial systems to perceptually recover where humans do, suggesting alternative directions for theory and model building. These findings encourage a tighter synergy between the cognitive science and engineering of audition.