Abstract

Silent speech interfaces offer an efficient alternative communication modality for individuals with voice disorders and for situations in which vocalized speech is compromised by noisy environments. Despite recent progress in their development, these systems face several challenges that prevent their wide acceptance, such as bulkiness, obtrusiveness, and immobility. Herein, the material optimization, structural design, deep learning algorithm, and system integration of mechanically and visually unobtrusive silent speech interfaces that realize both speaker identification and speech content identification are presented. Conformal, transparent, and self‐adhesive electromyography electrode arrays are designed to capture speech‐relevant muscle activity. Temporal convolutional networks are employed to recognize speakers and to convert the sensed signals into spoken content. The resulting silent speech interfaces achieve 97.5% speaker classification accuracy and 91.5% keyword classification accuracy using four electrodes. The speech interface is further integrated with an optical hand‐tracking system and a robotic manipulator for human‐robot collaboration in both assembly and disassembly processes. The integrated system controls the robotic manipulator by silent speech and facilitates the hand‐over process through hand motion trajectory detection. The developed framework enables natural robot control in noisy environments and lays the groundwork for collaborative human‐robot tasks involving multiple human operators.