Abstract

Deep neural networks (DNNs) have become the dominant technique for acoustic-phonetic modeling due to their markedly improved performance over other models. Despite this, little is understood about the computation they implement in creating phonemic categories from highly variable acoustic signals. In this paper, we analyzed a DNN trained for phoneme recognition and characterized its representational properties, at both the single-node and population levels in each layer. At the single-node level, we found strong selectivity to distinct phonetic features in all layers. Node selectivity to specific manners and places of articulation appeared from the first hidden layer and became more explicit in deeper layers. Furthermore, we found that nodes with similar phonetic feature selectivity were activated differentially by different exemplars of those features. Thus, each node becomes tuned to a particular acoustic manifestation of the same feature, providing an effective representational basis for the formation of invariant phonemic categories. This study reveals that phonetic features organize the activations in different layers of a DNN, a result that mirrors recent findings of feature encoding in the human auditory system. These insights may provide a better understanding of the limitations of current models, leading to new strategies to improve their performance.

Index Terms: Deep neural networks, deep learning, automatic speech recognition.
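The abstract does not specify how node selectivity was quantified. As a rough illustration only, one common way to measure a hidden node's tuning to phonetic feature categories (e.g., manners of articulation) is an F-like ratio of between-category to within-category activation variance. The sketch below assumes hypothetical inputs: an `activations` array of shape (frames, nodes) from one hidden layer and a per-frame phoneme `labels` array; all names are illustrative, not the authors' method.

```python
import numpy as np

def feature_selectivity(activations, labels, categories):
    """Rough per-node selectivity to phonetic feature categories.

    activations : (n_frames, n_nodes) hidden-layer activations
    labels      : (n_frames,) phoneme label per frame
    categories  : dict mapping category name -> set of phoneme labels,
                  e.g. manners of articulation {"plosive": {...}, ...}

    Returns an (n_nodes,) array: for each node, the ratio of
    between-category to within-category activation variance.
    Higher values indicate stronger tuning to one category.
    """
    # Group frames by category; a sketch, so empty categories are not handled.
    groups = [activations[np.isin(labels, list(phones))]
              for phones in categories.values()]
    grand_mean = activations.mean(axis=0)

    # Between-category variance: spread of category means around the grand mean.
    between = sum(len(g) * (g.mean(axis=0) - grand_mean) ** 2 for g in groups)
    between /= (len(groups) - 1)

    # Within-category variance: pooled spread of frames inside each category.
    within = sum(((g - g.mean(axis=0)) ** 2).sum(axis=0) for g in groups)
    within /= (sum(len(g) for g in groups) - len(groups))

    return between / (within + 1e-12)

# Illustrative usage with made-up manner classes and hypothetical data:
manners = {"plosive": {"p", "t", "k", "b", "d", "g"},
           "fricative": {"f", "v", "s", "z"},
           "nasal": {"m", "n", "ng"}}
rng = np.random.default_rng(0)
layer_acts = rng.random((1000, 256))                 # stand-in activations
frame_labels = rng.choice(sorted(set().union(*manners.values())), 1000)
sel = feature_selectivity(layer_acts, frame_labels, manners)
top_nodes = np.argsort(sel)[::-1][:10]               # most manner-selective nodes
```

Applying such a metric layer by layer would show whether selectivity sharpens with depth, consistent with the abstract's observation that feature tuning becomes more explicit in deeper layers.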
