Abstract

Automatic tools to detect hypernasality have traditionally been designed to analyze sustained vowels exclusively. This is in sharp contrast with clinical recommendations, which consider it necessary to use a variety of utterance types (e.g., repeated syllables, sustained sounds, sentences). This study explores the feasibility of detecting hypernasality automatically from speech samples other than sustained vowels. The participants were 39 patients and 39 healthy controls. Six types of utterances were used: counting from 1 to 10, repetition of syllable sequences, sustained consonants, a sustained vowel, words, and sentences. The recordings were obtained, with the help of a mobile app, from Spain, Chile and Ecuador. Multiple acoustic features were computed from each utterance (e.g., MFCCs, formant frequencies). After a selection process, the best 20 features served to train different classification algorithms. Accuracy was highest with syllable sequences and with some words and sentences. Accuracy increased slightly when classifiers were trained with two or three utterances. However, the best results were obtained by combining the outputs of multiple classifiers. We conclude that protocols for automatic evaluation of hypernasality should include a variety of utterance types. It seems feasible to detect hypernasality automatically with mobile devices.

Highlights

  • Speech is described as hypernasal when there is an abnormal increase in nasal resonance during the production of oral sounds

  • Hypernasality (HN) is commonly observed in patients with cleft palate (CP) and in other groups of patients who have a short velum that cannot achieve complete contact with the posterior pharyngeal wall

  • This study explores to what extent utterances other than sustained vowels can be used to detect HN automatically


Introduction

Speech is described as hypernasal when there is an abnormal increase in nasal resonance during the production of oral sounds. This condition results from insufficient closure of the velopharyngeal port, which allows the air stream to flow through the nasal cavity during the production of oral vowels and consonants. One promising approach to its objective assessment consists in using automatic classification systems trained with different sets of acoustic features [5]. This approach has two major advantages: it is non-invasive, and the required technology is nowadays universally available (e.g., on mobile phones).
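The pipeline described above (extract many acoustic features, keep the best 20, train several classifiers and combine their outputs) can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's implementation: the feature values, classifier choices, and sample counts here are stand-ins.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for acoustic features (MFCCs, formant frequencies, etc.)
# from 78 speakers (39 patients, 39 controls), labeled 0/1.
X, y = make_classification(n_samples=78, n_features=60,
                           n_informative=12, random_state=0)

# Combine the outputs of several classifiers by soft voting.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC(probability=True)),
        ("tree", DecisionTreeClassifier(random_state=0)),
    ],
    voting="soft",
)

# Keep the 20 highest-scoring features, then classify.
clf = make_pipeline(SelectKBest(f_classif, k=20), ensemble)
acc = cross_val_score(clf, X, y, cv=5).mean()
print(f"cross-validated accuracy: {acc:.2f}")
```

In a real system each utterance type would yield its own feature vector, and per-utterance classifiers could be fused the same way the ensemble above fuses individual models.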

