Abstract

The presence of velopharyngeal dysfunction, dental occlusion, and mislearned articulation in individuals with cleft lip and palate (CLP) results in the production of misarticulated stop consonants. The present work uses vowel onset points (VOPs) as anchor points around which the consonant-vowel (CV) transition regions are segmented, in order to analyze the differences between normal and misarticulated stops. VOPs are located using an epoch-synchronously computed feature called the maximum weighted inner product. The spectro-temporal dynamics of the CV transitions anchored around the VOP are analyzed using two-dimensional discrete cosine transform (2D-DCT) coefficients, which are derived from a single-pole filter (SPF) based time-frequency representation. The SPF-based 2D-DCT coefficients are used to train a support vector machine for the classification of normal and misarticulated stops, where the class of misarticulated stops includes weak, nasalized, palatal, velar, pharyngeal, glottal, and devoicing errors produced by CLP speakers. The performance of the proposed VOP detection algorithm is evaluated on a database containing CV units of normal and misarticulated stops, and the results are compared with those of state-of-the-art VOP detection methods. The classification results obtained with the proposed SPF-based 2D-DCT coefficients are compared with those obtained with short-time Fourier transform-based 2D-DCT coefficients and Mel-frequency cepstral coefficients. Further, the performance of the proposed system is compared with that of the hidden Markov model-based goodness of pronunciation approach.
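The 2D-DCT feature extraction described above can be illustrated with a minimal sketch. It assumes the time-frequency representation is already available as a small matrix (rows are frequency bins, columns are frames) and that a VOP frame index is known. The function names, the window sizes, and the number of retained coefficients are hypothetical choices for illustration; the abstract does not specify them. The SPF analysis, the VOP detector, and the SVM are not shown.

```python
import math

def segment_around_vop(tfr, vop_frame, n_before, n_after):
    """Cut the CV transition region: frames [vop - n_before, vop + n_after)."""
    start = max(0, vop_frame - n_before)
    stop = min(len(tfr[0]), vop_frame + n_after)
    return [row[start:stop] for row in tfr]

def dct_basis(n):
    """Orthonormal DCT-II basis, returned as n rows of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def dct_2d(patch):
    """Separable 2D DCT-II: C_freq * patch * C_time^T."""
    c_freq = dct_basis(len(patch))
    c_time = dct_basis(len(patch[0]))
    c_time_t = [list(col) for col in zip(*c_time)]
    return matmul(matmul(c_freq, patch), c_time_t)

def low_order_features(patch, n_keep_freq, n_keep_time):
    """Keep the low-order block of coefficients as a flat feature vector."""
    coeffs = dct_2d(patch)
    return [coeffs[u][v] for u in range(n_keep_freq) for v in range(n_keep_time)]

# Toy time-frequency matrix: 6 frequency bins x 12 frames (made-up values).
tfr = [[math.sin(0.3 * f * t) + 1.0 for t in range(12)] for f in range(6)]
patch = segment_around_vop(tfr, vop_frame=6, n_before=4, n_after=4)
features = low_order_features(patch, n_keep_freq=3, n_keep_time=3)
```

The low-order coefficients summarize the slowly varying spectro-temporal shape of the patch, which is why a small block of them can serve as a compact input vector for a classifier such as a support vector machine.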

