Abstract

Individuals who have undergone laryngectomy often rely on handheld transducers (i.e., the electrolarynx) to excite the vocal tract and produce speech. Widely used electrolarynx designs are limited in that they require manual control of voice activity and pitch modulation. It would be advantageous to have an interface that requires less training, perhaps using the remaining, intact speech production system as a scaffold. Strong evidence exists that aspects of head motion and facial gestures are highly correlated with gestures of voicing and pitch. The goal of project MANATEE is therefore to develop an electrolarynx control interface that takes advantage of those correlations. The focus of the current study is to determine the feasibility of using head and facial features to accurately and efficiently modulate the pitch of a speaker's electrolarynx in real time on a mobile platform using the built-in video camera. A prototype interface, capable of running on desktop machines and compatible Android devices, is implemented using OpenCV for video feature extraction and statistical prediction of the electrolarynx control signal. Initial performance evaluation is promising, showing pitch prediction accuracies at double the chance-level baseline and prediction delays well below the perceptually relevant ~50 ms threshold.
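To make the core idea concrete, the statistical mapping from per-frame head/facial features to an electrolarynx pitch command might be sketched as below. This is a minimal illustration, not the authors' implementation: the feature names (`head_y`, `brow_height`), the linear least-squares model, and the simulated data are all assumptions standing in for the paper's unspecified predictor; in the actual prototype, OpenCV would extract the video features.

```python
import numpy as np

def fit_pitch_model(features, pitches):
    """Least-squares fit of pitch ≈ features @ w + b (bias folded in)."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    w, *_ = np.linalg.lstsq(X, pitches, rcond=None)
    return w

def predict_pitch(model, frame_features):
    """Predict a pitch (Hz) for one video frame's feature vector."""
    x = np.append(frame_features, 1.0)  # append bias term
    return float(x @ model)

# Simulated training data (hypothetical): pitch rises with head elevation
# and eyebrow height, mimicking the correlations the abstract describes.
rng = np.random.default_rng(0)
feats = rng.uniform(0.0, 1.0, size=(200, 2))          # [head_y, brow_height]
f0 = 100.0 + 60.0 * feats[:, 0] + 20.0 * feats[:, 1]  # target pitch in Hz

model = fit_pitch_model(feats, f0)
print(round(predict_pitch(model, np.array([0.5, 0.5])), 1))  # → 140.0
```

In a real-time system, `predict_pitch` would run once per camera frame, so keeping the model this small is what makes sub-50 ms prediction delays plausible on mobile hardware.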
