Abstract

This review summarises the status of silent speech interface (SSI) research. SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication whenever normal verbal communication is not possible or not desirable. In this review, we focus on the first case and present the latest SSI research aimed at providing new alternative and augmentative communication methods for persons with severe speech disorders. SSIs can employ a variety of biosignals to enable silent communication, such as electrophysiological recordings of neural activity, electromyographic (EMG) recordings of vocal tract movements, or the direct tracking of articulator movements using imaging techniques. Depending on the disorder, some sensing techniques may be better suited than others to capture speech-related information. For instance, EMG and imaging techniques are well suited for laryngectomised patients, whose vocal tract remains almost intact but who are unable to speak after the removal of the vocal folds, but fail for severely paralysed individuals. From the biosignals, SSIs decode the intended message using automatic speech recognition or speech synthesis algorithms. Despite considerable advances in recent years, most present-day SSIs have only been validated in laboratory settings for healthy users. Thus, as discussed in this paper, a number of challenges remain to be addressed in future research before SSIs can be promoted to real-world applications. If these issues can be addressed successfully, future SSIs will improve the lives of persons with severe speech impairments by restoring their communication capabilities.

Highlights

  • Speech is the most convenient and natural form of human communication

  • In this paper, we review recent attempts to decode speech from non-acoustic biosignals generated during speech production, ranging from capturing the movement of the speech articulators to recording brain activity

  • Speech can be decoded by automatic speech recognition (ASR) or by direct speech synthesis

Summary

INTRODUCTION

Speech is the most convenient and natural form of human communication. However, normal speech communication is not always possible. People with speech impairments often develop feelings of personal isolation and social withdrawal, which can lead to clinical depression [5]–[11]. Apart from clinical uses, other potential applications of this technology include providing privacy, enabling telephone conversations to be held without being overheard by bystanders, and enhancing normal spoken communication in noisy environments [17], [38]. These applications are possible because biosignals are largely insensitive to environmental noise and are independent of the acoustic speech signal (i.e., these biosignals can be captured even when no vocalisation is performed).

Gonzalez-Lopez et al.: SSIs for Speech Restoration: A Review

SPEECH AND LANGUAGE DISORDERS
SILENT SPEECH TO TEXT
DIRECT SPEECH SYNTHESIS
COMPARISON OF THE TWO SSI APPROACHES
SENSING TECHNIQUES
CURRENT CHALLENGES AND FUTURE RESEARCH DIRECTIONS
Findings
CONCLUDING REMARKS
