Abstract
With increasing global demand for learning English as a second language, there has been considerable interest in methods for the automatic assessment of spoken language proficiency, both for use in interactive electronic learning tools and for grading candidates for formal qualifications. This paper presents an automatic system for assessing spontaneous spoken language. Prompts or questions requiring spontaneous responses elicit more natural speech, which better reflects a learner’s proficiency level than read speech. In addition to the challenges of highly variable non-native learner speech and noisy real-world recording conditions, this requires an automatic system to handle disfluent, non-grammatical, spontaneous speech for which the underlying text is unknown. To handle these challenges, a strong deep-learning-based speech recognition system is applied in combination with a Gaussian Process (GP) grader. A range of features derived from the audio using the recognition hypothesis are investigated for their efficacy in the automatic grader. The proposed system is shown to predict grades at a similar level to the original examiner graders on real candidate entries. Interpolation with the examiner grades further boosts performance. The ability to reject poorly estimated grades is also important, and measures are proposed to evaluate the performance of rejection schemes. The GP variance is used to decide which automatic grades should be rejected. Backing off to an expert grader for the least confident grades yields gains.
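The rejection and back-off idea described in the abstract can be illustrated with a minimal sketch. It assumes that a GP grader has already produced a predictive mean (the automatic grade) and a predictive variance for each candidate. The function names, the `reject_fraction` and `weight` parameters, and the use of simple linear interpolation are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def back_off_to_expert(
    auto_grades: Sequence[float],
    variances: Sequence[float],
    expert_grades: Sequence[float],
    reject_fraction: float,
) -> List[float]:
    """Replace the least confident automatic grades with expert grades.

    Confidence is taken to be the GP predictive variance: a larger variance
    means a less trusted grade. `reject_fraction` (a hypothetical parameter)
    is the share of candidates that is sent to the expert grader.
    """
    if not (len(auto_grades) == len(variances) == len(expert_grades)):
        raise ValueError("all inputs must have the same length")
    if not 0.0 <= reject_fraction <= 1.0:
        raise ValueError("reject_fraction must lie in [0, 1]")

    n_reject = int(round(reject_fraction * len(auto_grades)))
    # Candidate indices ordered from highest variance (least confident) down.
    by_uncertainty = sorted(
        range(len(variances)), key=lambda i: variances[i], reverse=True
    )
    rejected = set(by_uncertainty[:n_reject])
    return [
        expert_grades[i] if i in rejected else auto_grades[i]
        for i in range(len(auto_grades))
    ]


def interpolate(auto_grade: float, examiner_grade: float, weight: float = 0.5) -> float:
    """Linearly interpolate an automatic grade with an examiner grade.

    Linear interpolation with a fixed weight is an assumption made here
    for illustration only.
    """
    return weight * auto_grade + (1.0 - weight) * examiner_grade


# Made-up grades and variances, for illustration only.
auto = [20.0, 15.0, 25.0, 10.0, 18.0]
var = [1.0, 4.0, 0.5, 9.0, 2.0]
expert = [21.0, 12.0, 24.0, 14.0, 18.5]
final = back_off_to_expert(auto, var, expert, reject_fraction=0.4)
```

With a rejection fraction of 0.4, the two candidates with the largest variance receive the expert grade and the rest keep their automatic grade. Sweeping `reject_fraction` from 0 to 1 is one way to study how much expert effort is needed for a given gain.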
Highlights
There is a high demand around the world for the learning of English as a second language
In [14], a deep neural network (DNN)-based automatic speech recognition (ASR) system gave a 31% relative word error rate (WER) reduction on data from the Arizona English Language Learner Assessment (AZELLA) test, which is composed of a variety of spoken tasks developed by professional educators
Fluency features are derived from the speech recognition hypothesis, time-aligned to the audio
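As a rough illustration of deriving fluency features from a time-aligned recognition hypothesis, the sketch below computes a few simple statistics from word start and end times. The `AlignedWord` type, the three features, and the `long_pause` threshold of 0.5 seconds are assumptions chosen for this example. They are not the feature set used in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AlignedWord:
    word: str
    start: float  # start time in seconds
    end: float    # end time in seconds


def fluency_features(words: List[AlignedWord], long_pause: float = 0.5) -> Dict[str, float]:
    """Compute illustrative fluency statistics from time-aligned words."""
    if not words:
        return {"words_per_second": 0.0, "mean_gap": 0.0, "long_pauses": 0.0}

    # Time from the start of the first word to the end of the last word.
    total_time = words[-1].end - words[0].start
    # Silence between consecutive words; overlaps are clipped to zero.
    gaps = [max(0.0, nxt.start - cur.end) for cur, nxt in zip(words, words[1:])]

    return {
        "words_per_second": len(words) / total_time if total_time > 0 else 0.0,
        "mean_gap": sum(gaps) / len(gaps) if gaps else 0.0,
        "long_pauses": float(sum(g >= long_pause for g in gaps)),
    }


# A made-up alignment, for illustration only.
hypothesis = [
    AlignedWord("i", 0.0, 0.2),
    AlignedWord("like", 0.3, 0.6),
    AlignedWord("um", 1.4, 1.6),
    AlignedWord("music", 1.7, 2.0),
]
features = fluency_features(hypothesis)
```

Because the underlying text is unknown, such statistics depend on the recognizer output, so recognition errors affect the word timings and therefore the features.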
ALTA Institute / Department of Engineering, University of Cambridge, Cambridge, U.K.