Abstract
This study investigates differences in sentence and story production between native and non-native speakers of English for use with Automatic Speech Recognition (ASR) systems. Previous studies have shown that production errors by non-native speakers of English include misproduced segments (Flege, 1995), longer pause durations (Anderson-Hsieh and Venkatagiri, 1994), abnormal pause location within clauses (Kang, 2010), and non-reduction of function words (Jang, 2009). The present study uses phonemically balanced sentences from TIMIT (Garofolo et al., 1993) and a story to provide an additional comparison of production by native and non-native speakers of English. Consistent with previous research, preliminary results suggest that non-native speakers of English fail to produce flaps and reduced vowels, insert or delete segments, engage in more self-correction, and place pauses in different locations from native speakers. Non-native speakers furthermore produce different patterns of intonation from native speakers and produce errors indicative of transfer from their L1 phonology, such as coda deletion and vowel epenthesis. Native speaker productions also contain errors, the majority of which are content-related. These results indicate that the difficulties English ASR systems face in recognizing non-native speech are due largely to the heterogeneity of non-native production.