Abstract

Nasality is an important characteristic of several languages, European Portuguese among them. This paper addresses the challenge of nasality detection in speech interfaces based on surface electromyography (EMG). We explore whether the EMG signal carries useful information about velum movement, assess whether muscles lying deeper in the face and neck region can be measured with surface electrodes, and identify the best electrode location for doing so. The procedure we adopted uses Real-Time Magnetic Resonance Imaging (RT-MRI), collected from a set of speakers, as a reference for interpreting the EMG data. By ensuring compatible recording conditions and proper time alignment between the EMG and RT-MRI data, we are able to accurately estimate when the velum moves, and the type of movement, when a nasal vowel occurs. The combination of these two sources revealed distinct characteristics in the EMG signal when a nasal vowel is uttered, which motivated a classification experiment. Overall, the results of this experiment provide evidence that velum movement can be detected using sensors positioned below the ear, between the mastoid process and the mandible, in the upper neck region. In a frame-based classification scenario, error rates as low as 32.5% across all speakers and 23.4% for the best speaker were achieved for nasal vowel detection. This is an encouraging result that motivates deeper exploration of the proposed approach as a promising route to an EMG-based speech interface for languages with strong nasal characteristics.

Highlights

  • Speech-based human-computer interfaces have reached high accuracy levels in controlled environments and are commercially available

  • The results of the analysis combining the EMG signal with the information extracted from the Real-Time Magnetic Resonance Imaging (RT-MRI) signal, two classification experiments, and a reproducibility assessment are presented

  • The second experiment divides the EMG signal into frames, but classification is performed over nasal and non-nasal zones, whose limits are known a priori from the information extracted from the RT-MRI
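The framing and zone labelling described in the highlights can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the frame length and shift, and the convention that zone limits are given as sample indices are all assumptions made for illustration. A frame is labelled nasal when its centre sample falls inside a zone whose limits are assumed to come from the RT-MRI annotation, and the frame error rate is the fraction of frames whose predicted label differs from that reference.

```python
def frame_signal(signal, frame_len, frame_shift):
    """Split a 1-D sequence into overlapping frames.

    A trailing partial frame is dropped (illustrative choice).
    """
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, frame_shift)]


def label_frames(n_frames, frame_len, frame_shift, nasal_zones):
    """Return 1 for frames whose centre sample lies in a nasal zone, else 0.

    nasal_zones is a list of (start, end) sample indices, end exclusive.
    These limits are assumed to be known a priori (e.g. from RT-MRI).
    """
    labels = []
    for i in range(n_frames):
        centre = i * frame_shift + frame_len // 2
        in_zone = any(start <= centre < end for start, end in nasal_zones)
        labels.append(int(in_zone))
    return labels


def frame_error_rate(predicted, reference):
    """Fraction of frames where the predicted label differs from the reference."""
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    errors = sum(p != r for p, r in zip(predicted, reference))
    return errors / len(reference)


# Toy data: 20 samples, hypothetical frame length 4 and shift 2.
emg = list(range(20))
frames = frame_signal(emg, frame_len=4, frame_shift=2)
reference = label_frames(len(frames), 4, 2, nasal_zones=[(6, 12)])
predicted = [0, 0, 1, 1, 0, 0, 0, 1, 0]  # made-up classifier output
error = frame_error_rate(predicted, reference)
```

Labelling by the frame centre is only one possible convention; a majority-overlap rule would be an equally reasonable choice and would change the labels of frames that straddle a zone boundary.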


Summary

Introduction

Speech-based human-computer interfaces have reached high accuracy levels in controlled environments and are commercially available. The production of a nasal vowel involves air flow through both the oral and nasal cavities. The passage of air to the nasal cavity is essentially controlled by the velum which, when lowered, opens the velopharyngeal port and enables resonance in the nasal cavity, causing the sound to be perceived as nasal. Among the muscles involved in velum movement, one has as its main function elevating and retracting the soft palate to achieve velopharyngeal closure. Another, the musculus uvulae, is embodied in the structure of the soft palate; in speech, it helps velopharyngeal closure by filling the space between the elevated velum and the posterior pharyngeal wall [16].

Objectives
Methods
Results
Discussion
Conclusion
