Abstract

The vowel onset point (VOP) is the instant at which a vowel begins in the speech signal. Accurate detection of VOPs is useful for applications such as consonant–vowel (CV) unit recognition and speech rate modification. Existing VOP detection methods locate VOPs within a 40 ms deviation, which may not be accurate enough for these applications. In this paper, a two-level approach using multiple sources of evidence is proposed for accurate VOP detection. At the first level, VOPs are hypothesized by combining the complementary evidence from the excitation source, spectral peaks and modulation spectrum. At the second level, the hypothesized VOPs are verified as genuine or spurious, and their positions are corrected using the uniform epoch intervals present in the vowel region. The zero frequency filter method is used to determine the epoch locations in speech. The performance of the proposed method is analyzed using the TIMIT database and compared with the recent method that combines evidence from the excitation source, spectral peaks and modulation spectrum. Using the proposed method, about 85% of VOPs are detected within a 10 ms deviation.
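
To make the deviation figures concrete, the following is a minimal sketch of how a "detected within N ms" rate could be computed from reference and detected VOP times. The abstract does not describe the scoring protocol, so the matching rule used here (a reference VOP counts as detected if any hypothesized VOP lies within the tolerance, without one-to-one matching) is an assumption, and the function name `detection_rate` and the sample timestamps are hypothetical.

```python
from bisect import bisect_left


def detection_rate(reference_ms, detected_ms, tolerance_ms=10.0):
    """Fraction of reference VOPs that have a detected VOP within tolerance_ms.

    Assumed scoring rule (not taken from the paper): a reference VOP is a hit
    if its nearest detected VOP is at most tolerance_ms away.
    """
    if not reference_ms:
        raise ValueError("need at least one reference VOP")
    detected = sorted(detected_ms)
    hits = 0
    for ref in reference_ms:
        i = bisect_left(detected, ref)
        # The nearest detection is either just before or at/after the reference.
        candidates = detected[max(i - 1, 0):i + 1]
        if any(abs(d - ref) <= tolerance_ms for d in candidates):
            hits += 1
    return hits / len(reference_ms)


# Hypothetical timestamps in milliseconds, for illustration only.
reference = [100.0, 250.0, 400.0, 620.0]
detected = [104.0, 262.0, 395.0, 700.0]

print(detection_rate(reference, detected, tolerance_ms=10.0))  # 0.5
print(detection_rate(reference, detected, tolerance_ms=40.0))  # 0.75
```

Tightening the tolerance from 40 ms to 10 ms lowers the rate in this toy data because the detection at 262 ms is 12 ms away from its reference, which illustrates why a method tuned for the looser criterion may fall short under the stricter one.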
