Direct articulatory observation reveals phoneme recognition performance characteristics of a self-supervised speech model

Xuan Shi,Tiantian Feng,Kevin Huang,Sudarsana Reddy Kadiri,Jihwan Lee,Yijing Lu,Yubin Zhang,Louis Goldstein,Shrikanth Narayanan

doi:10.1121/10.0034430


Journal: JASA Express Letters

Publication Date: Nov 1, 2024

#Phoneme Recognition #Indian English Speakers


Abstract

Variability in speech pronunciation is widely observed across different linguistic backgrounds, which impacts modern automatic speech recognition performance. Here, we evaluate the performance of a self-supervised speech model in phoneme recognition using direct articulatory evidence. Findings indicate significant differences in phoneme recognition, especially in front vowels, between American English and Indian English speakers. To gain a deeper understanding of these differences, we conduct real-time MRI-based articulatory analysis, revealing distinct velar region patterns during the production of specific front vowels. This underscores the need to deepen the scientific understanding of self-supervised speech model variances to advance robust and inclusive speech technology.
