Towards a generic approach for automatic speech recognition error detection and classification

Rahhal Errattahi, Hassan Ouahmane, Thomas Hain, Asmaa El Hannani

doi:10.1109/atsip.2018.8364511

Abstract

Automatic Speech Recognition (ASR) errors are essentially unavoidable, which motivates the development of post hoc tools to handle them. This paper addresses the problem of errors in continuous speech recognition outputs in order to improve the exploitation of ASR transcriptions. We propose a generic classifier-based approach for both error detection and error type classification. Unlike the majority of research in this field, we handle recognition errors independently of the ASR decoder, using a set of features derived exclusively from the recognizer output; the approach should therefore be usable with any ASR system. Experiments on TV program transcription data show that the proposed non-decoder feature setup achieves performance competitive with state-of-the-art systems in ASR error detection and classification. Furthermore, Support Vector Machines trained on the proposed feature set appear to be an effective classifier for ASR error type classification, with an accuracy of 82.41%.

Full Text