Abstract
We investigate the automatic detection of annotation errors in single-speaker read-speech corpora used for text-to-speech (TTS) synthesis. We use various word-level feature sets and evaluate several detection methods based on support vector machines, extremely randomized trees, and k-nearest neighbors, as well as novelty and outlier detection. We show that annotation error detection performs very well at both the word and utterance level, with high precision and recall and an F1 measure of almost 90% and 97%, respectively.
Index Terms: annotation error detection, classification, novelty detection, read speech corpora, speech synthesis
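To make the reported measures concrete, the following minimal sketch computes precision, recall, and F1 from binary word-level decisions (1 = annotation error) and then rescoring the same decisions at the utterance level. The aggregation rule used here, that an utterance counts as erroneous if any of its words is flagged, is an assumption made for illustration and is not stated in the abstract. The function names and the toy labels are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for binary labels where 1 marks an error."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def utterance_labels(word_labels: Sequence[int], utterance_ids: Sequence[int]) -> List[int]:
    """Collapse word-level labels to one label per utterance.

    Assumption (not from the paper): an utterance is erroneous
    if at least one of its words is labeled as erroneous.
    """
    flagged: Dict[int, int] = {}
    for label, utt in zip(word_labels, utterance_ids):
        flagged[utt] = max(flagged.get(utt, 0), label)
    return [flagged[utt] for utt in sorted(flagged)]


# Hypothetical toy data: eight words spread over three utterances.
utt_ids = [0, 0, 0, 1, 1, 2, 2, 2]
true_words = [0, 1, 0, 0, 0, 1, 1, 0]
pred_words = [0, 1, 0, 0, 1, 1, 0, 0]

word_scores = precision_recall_f1(true_words, pred_words)
utt_scores = precision_recall_f1(
    utterance_labels(true_words, utt_ids),
    utterance_labels(pred_words, utt_ids),
)
print("word-level P/R/F1:", word_scores)
print("utterance-level P/R/F1:", utt_scores)
```

Under this aggregation rule a single correctly flagged word is enough to mark the whole utterance, so utterance-level recall can exceed word-level recall on the same decisions.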