Abstract

We present an automated method for estimating the difficulty of spoken texts for use in generating items that assess non-native learners’ listening proficiency. We collected information on the perceived difficulty of listening to various English monologue speech samples using a Likert-scale questionnaire distributed to 15 non-native English learners. We averaged the overall ratings provided by three non-native learners at different proficiency levels into an overall score of listenability. We then trained a multiple linear regression model with the listenability score as the dependent variable and features from both natural language and speech processing as the independent variables. Our method demonstrated a correlation of 0.76 with the listenability score, comparable to the agreement between the non-native learners’ ratings and the listenability score.
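The pipeline described in the abstract (average the raters' scores into a listenability score, fit a multiple linear regression on text and speech features, then correlate predictions with the score) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the two feature columns (a speech-rate value and a mean word length), the toy ratings, and the function names are hypothetical and are not taken from the paper, whose actual feature set does not appear in this excerpt. The fit uses ordinary least squares via the normal equations, which is one standard way to estimate such a model and is not necessarily the authors' implementation.

```python
from statistics import mean


def average_ratings(ratings_per_sample):
    """Average each sample's rater scores into one listenability score."""
    return [mean(r) for r in ratings_per_sample]


def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept.

    Solves the normal equations (X^T X) b = X^T y by Gauss-Jordan
    elimination and returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + [float(v) for v in x] for x in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; normal equations are singular")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, X):
    """Apply the fitted coefficients to each feature row."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], x)) for x in X]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


if __name__ == "__main__":
    # Hypothetical toy data: three Likert ratings per speech sample.
    ratings = [[4, 5, 4], [2, 3, 2], [3, 3, 4], [1, 2, 2], [5, 4, 5]]
    # Hypothetical features per sample: [words per minute, mean word length].
    features = [[120, 4.1], [180, 5.0], [150, 4.6], [200, 5.4], [110, 4.3]]

    listenability = average_ratings(ratings)
    coefs = fit_linear_regression(features, listenability)
    predicted = predict(coefs, features)
    print("coefficients:", coefs)
    print("correlation with listenability:", pearson(predicted, listenability))
```

In this sketch the correlation is computed on the same samples used for fitting, which overstates how well a model generalizes; a held-out or cross-validated evaluation would give a fairer estimate.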

Highlights

  • Extensive research has been conducted on the prediction of difficulty of understanding written language based on linguistic features

  • Our study explores a systematic way to measure the difficulty of spoken texts using natural language processing (NLP) technology

  • The second bias may be more salient for participants at low proficiency levels and cause a skew toward higher listenability scores

Summary

Introduction

Extensive research has been conducted on the prediction of difficulty of understanding written language based on linguistic features. This has resulted in various readability formulas, such as the Fry readability index and the Flesch-Kincaid formula, which is scaled to United States primary school grade levels. Linguistic features unique to spoken language such as speech rate, disfluency features, and phonological phenomena contribute to the processing difficulty of spoken texts as such linguistic features pose challenges at both perception (or parsing) and comprehension levels (Anderson, 2005). Research evidence indicated that ESL students performed better on listening comprehension tasks when the rate of speech was slowed and meaningful pauses were included (Blau, 1990; Brindley and Slatyer, 2002). Disfluency features (e.g., pause, repetitions) may provide the listener with more processing time and redundant in-
