Abstract

The voice onset time (VOT) of a stop consonant is the interval between its burst onset and voicing onset. Among a variety of research topics on VOT, one that has been studied for years is how VOTs are efficiently measured. Manual annotation is a feasible way, but it becomes a time-consuming task when the corpus size is large. This paper proposes an automatic VOT estimation method based on an onset detection algorithm. At first, a forced alignment is applied to identify the locations of stop consonants. Then a random forest based onset detector searches each stop segment for its burst and voicing onsets to estimate a VOT. The proposed onset detection can detect the onsets in an efficient and accurate manner with only a small amount of training data. The evaluation data extracted from the TIMIT corpus were 2344 words with a word-initial stop. The experimental results showed that 83.4% of the estimations deviate less than 10 ms from their manually labeled values, and 96.5% of the estimations deviate by less than 20 ms. Some factors that influence the proposed estimation method, such as place of articulation, voicing of a stop consonant, and quality of succeeding vowel, were also investigated.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call