Abstract
The aim of the present study is to investigate the performance of automatic perceptual judgment models built with neural networks. In previous studies, Franco et al. (1997) used HMM-derived scores based on posterior probabilities of phone segments and demonstrated a high correlation with human raters. Deville et al. (1999) used an HMM/ANN recognition approach and showed how the results of automatic speech recognition can be used for perceptual judgments. Because most previous studies relied on automatic speech recognition in their analysis, the present study takes a different approach, using features and raw data. Native speakers of English will listen to English sentences produced by native and non-native speakers of English, transcribe what they hear, and give one of three perceptual judgments: foreign-accentedness, fluency, or comprehensibility. The data will be fed into prediction models in three different ways: one with annotated features (pauses, durations, etc.), another with Mel-frequency cepstral coefficients (MFCCs), and a third with Mel-spectrograms. The performance of the models will be measured by analyzing the correlation between the judgments of the models and those of the human raters. The preliminary results of this study will be used to build more accurate automatic proficiency judgment models.
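The evaluation step described above, correlating model judgments with human ratings, can be illustrated with a minimal sketch. It assumes Pearson's r as the correlation measure, which the abstract does not specify; the function name `pearson_r` and the score lists are hypothetical and used only for illustration.

```python
import math


def pearson_r(model_scores, human_scores):
    """Pearson correlation between two equal-length lists of scores.

    Assumption: Pearson's r is used here for illustration only; the
    study does not state which correlation measure it applies.
    """
    n = len(model_scores)
    if n != len(human_scores) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")

    mean_m = sum(model_scores) / n
    mean_h = sum(human_scores) / n

    # Deviations of each score from its own mean.
    dev_m = [m - mean_m for m in model_scores]
    dev_h = [h - mean_h for h in human_scores]

    cov = sum(dm * dh for dm, dh in zip(dev_m, dev_h))
    var_m = sum(dm * dm for dm in dev_m)
    var_h = sum(dh * dh for dh in dev_h)

    if var_m == 0 or var_h == 0:
        raise ValueError("correlation is undefined for constant scores")

    return cov / math.sqrt(var_m * var_h)


# Hypothetical ratings for five sentences (not data from the study).
model = [1, 2, 3, 4, 5]
human = [2, 1, 4, 3, 5]
r = pearson_r(model, human)
```

A value of r near 1 would indicate that the model ranks and spaces the sentences much as the human raters do. A rank-based measure such as Spearman's rho might be preferable if the rating scale is treated as ordinal.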