Abstract

Speech unit segmentation is an important preprocessing step in the analysis of cleft palate speech. In Mandarin, a syllable is composed of two parts: an initial and a final. In cleft palate speech, resonance disorders occur in the finals and the voiced initials, while articulation disorders occur in the unvoiced initials. Thus, initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final (I/F) segmentation method is proposed. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which has the largest number of cleft palate patients in China. The cleft palate speech data include 824 speech segments, and the control samples contain 228 speech segments. Syllables are first extracted from the speech utterances. The proposed syllable extraction method requires no training stage and performs well for both voiced and unvoiced speech. The syllables are then classified as having either “quasi-unvoiced” or “quasi-voiced” initials, and a separate initial/final segmentation method is proposed for each of the two types. Moreover, a two-step segmentation method is proposed: the rough locations of the syllable and initial/final boundaries are refined in a second segmentation step to improve the robustness of the segmentation. The experiments show that initial/final segmentation accuracy is higher for syllables with quasi-unvoiced initials than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 over all syllables is 91.69%. For the control samples, P30 over all syllables is 91.24%.
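
The two figures of merit quoted in the abstract can be illustrated with a short sketch. It assumes that P30 denotes the percentage of detected boundaries lying within 30 ms of a manually labelled reference boundary, which the abstract itself does not define. The function name `boundary_metrics` and the boundary times are hypothetical and are not taken from the paper.

```python
def boundary_metrics(predicted_ms, reference_ms, tolerance_ms=30.0):
    """Return (mean absolute time error in ms, % of boundaries within tolerance).

    Assumption: "P30" is read here as the share of boundaries whose absolute
    error is at most 30 ms; the source does not spell out its definition.
    """
    if len(predicted_ms) != len(reference_ms) or len(predicted_ms) == 0:
        raise ValueError("need two equal-length, non-empty boundary lists")
    # Absolute deviation of each detected boundary from its reference label.
    errors = [abs(p - r) for p, r in zip(predicted_ms, reference_ms)]
    mean_error = sum(errors) / len(errors)
    pct_within = 100.0 * sum(e <= tolerance_ms for e in errors) / len(errors)
    return mean_error, pct_within


# Hypothetical boundary times in milliseconds (not data from the paper).
mean_err, p30 = boundary_metrics([100, 250, 400, 90], [105, 240, 440, 90])
# Errors are 5, 10, 40 and 0 ms, so mean_err == 13.75 and p30 == 75.0.
```

Reporting both numbers is useful because a low mean error can hide a few large misses, while the tolerance-based percentage shows how many boundaries are usable.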

Highlights

  • Cleft palate (CP) is a common congenital malformation caused by craniofacial alteration

  • Initial/final (I/F) segmentation is implemented in two steps: syllable segmentation, then I/F segmentation

  • Because some initials are very short, the speech frame duration is chosen to be shorter than the usual frame length, which gives more accurate I/F boundary locations in cleft palate speech


Summary

Introduction

Cleft palate (CP) is a common congenital malformation caused by craniofacial alteration. Because some initials are very short, the speech frame duration in this work is chosen to be shorter than the usual frame length, which gives more accurate initial/final (I/F) boundary locations in cleft palate speech.

Automatic initial and final segmentation in Mandarin cleft palate speech

The proposed system contains two main procedures: syllable extraction and I/F segmentation. For syllables with quasi-voiced initials, the segmentation method is based on the short-time autocorrelation and on the difference in waveform shape between initials and finals. For syllables with quasi-unvoiced initials, a two-step method is proposed to obtain the I/F boundaries: the rough I/F boundaries are located first and are then refined.
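
To make the role of short frames and short-time autocorrelation more concrete, the sketch below scores each frame of a signal by the peak of its normalized autocorrelation. It is not the authors' algorithm. The 8 kHz sample rate, the 10 ms frame, the 5 ms hop, the lag range, the synthetic noise-plus-tone "syllable", and the names `split_frames`, `periodicity_score` and `demo_scores` are all assumptions made for illustration.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz (assumed for this sketch)
FRAME_LEN = 80      # 10 ms frame, deliberately short (assumed value)
HOP = 40            # 5 ms hop between frame starts (assumed value)


def split_frames(samples, frame_len, hop):
    """Cut a signal into overlapping frames; a trailing partial frame is dropped."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def periodicity_score(frame, min_lag, max_lag):
    """Peak of the normalized short-time autocorrelation over [min_lag, max_lag].

    Close to 1 for a strongly periodic (voiced-like) frame, lower for noise.
    """
    if min_lag < 1:
        raise ValueError("min_lag must be at least 1")
    max_lag = min(max_lag, len(frame) // 2)  # keep enough overlap at every lag
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        head, tail = frame[:-lag], frame[lag:]
        norm = math.sqrt(sum(x * x for x in head) * sum(y * y for y in tail))
        if norm > 0.0:
            corr = sum(x * y for x, y in zip(head, tail)) / norm
            best = max(best, corr)
    return best


def demo_scores():
    """Score a synthetic 'syllable': 40 ms of noise, then 80 ms of a 200 Hz tone."""
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 0.1) for _ in range(320)]
    tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(640)]
    frames = split_frames(noise + tone, FRAME_LEN, HOP)
    # Lags 20..40 samples correspond to pitch candidates from 400 Hz down to 200 Hz.
    return [periodicity_score(f, 20, 40) for f in frames]
```

In this toy signal the scores stay low over the noise-like onset and rise to about 1 once frames lie entirely inside the periodic part, so a threshold on the score gives a rough boundary at frame resolution. A shorter hop tightens that resolution, which is consistent with the text's motivation for short frames, although very short frames also limit the lowest pitch that can be detected.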
