Abstract

We investigate the automatic detection of annotation errors in single-speaker read-speech corpora used for text-to-speech (TTS) synthesis. We use various word-level feature sets and evaluate several detection methods based on support vector machines, extremely randomized trees, and k-nearest neighbors, as well as novelty and outlier detection. We show that both word-level and utterance-level annotation error detection perform very well, with high precision and recall and with F1 scores of almost 90% and 97%, respectively.

Index Terms: annotation error detection, classification, novelty detection, read speech corpora, speech synthesis
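As a rough illustration of how the reported scores can be read, the sketch below computes precision, recall, and F1 at the word level and then at the utterance level. It is not the authors' evaluation code. The toy labels are invented, and the aggregation rule (an utterance counts as erroneous if any of its words is flagged) is an assumption made here for illustration only.

```python
from typing import List, Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for binary labels, where 1 marks an annotation error."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def utterance_labels(word_labels: Sequence[Sequence[int]]) -> List[int]:
    """Assumed aggregation: an utterance is erroneous if any of its words is flagged."""
    return [int(any(words)) for words in word_labels]


# Hypothetical toy data: three utterances, one list of word labels per utterance.
true_words = [[0, 1, 0], [0, 0], [1, 1, 0]]
pred_words = [[0, 1, 1], [0, 0], [1, 0, 0]]

# Word-level scores are computed over all words pooled together.
flat_true = [label for utt in true_words for label in utt]
flat_pred = [label for utt in pred_words for label in utt]
word_scores = precision_recall_f1(flat_true, flat_pred)

# Utterance-level scores are computed over one aggregated label per utterance.
utt_scores = precision_recall_f1(utterance_labels(true_words), utterance_labels(pred_words))

print("word-level      P/R/F1:", word_scores)
print("utterance-level P/R/F1:", utt_scores)
```

Under this assumed aggregation, a word-level miss inside an utterance that is still flagged elsewhere does not count against the utterance-level score, which is one way an utterance-level F1 can exceed the word-level one.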
