Abstract

Silent speech recognition is the ability to recognise intended speech without audio information. Useful applications arise wherever sound waves are not produced or cannot be heard, for example with speakers who have physical voice impairments or in environments where audio transmission is unreliable or insecure. A device that detects non-auditory signals and maps them to intended phonation could therefore assist in such situations. In this work, we propose a graphene-based strain gauge sensor which can be worn on the throat and detect small muscle movements and vibrations; machine learning algorithms then decode these non-audio signals and predict the intended speech. The sensor is highly wearable, exploiting graphene’s strength, flexibility and high conductivity, and is fabricated by screen printing graphene onto lycra fabric, yielding a highly flexible device able to pick up small throat movements. We also propose a framework for interpreting this information, exploring several machine learning techniques to predict intended words from the signals. A dataset of 15 unique words and four movements, each with 20 repetitions, was collected and used to train the machine learning algorithms. The results demonstrate that such sensors can predict spoken words: we achieved a word accuracy rate of 55% on the word dataset and 85% on the movement dataset. This work is a proof of concept for the viability of combining a highly wearable graphene strain gauge with machine learning methods to automate silent speech recognition.
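The paper's processing code is not reproduced here, but as a rough illustration of the kind of decoding pipeline the abstract describes, the sketch below turns one raw strain-gauge recording into a handcrafted feature vector. The function name and the specific features are our own assumptions, not the authors' published method.

    import numpy as np

    def extract_features(signal: np.ndarray) -> np.ndarray:
        """Summary statistics for one strain-gauge recording: a 1-D array of
        resistance samples captured while a word or movement is performed.
        The feature list is a plausible guess, not the paper's published set."""
        centered = signal - signal.mean()
        return np.array([
            signal.mean(),                        # DC level of the strain response
            signal.std(),                         # overall variability
            signal.max() - signal.min(),          # peak-to-peak amplitude
            np.sqrt(np.mean(signal ** 2)),        # RMS energy
            np.mean(np.abs(np.diff(signal))),     # mean absolute first difference
            float(np.sum(np.sign(centered[:-1])
                         != np.sign(centered[1:]))),  # mean-crossing count
        ])

Each repetition of a word or movement would yield one such vector, forming one row of the training matrix for the classifiers evaluated later.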

Highlights

  • According to the WHO, around 5% of the population worldwide have hearing and speech impairments [1]

  • To quantify sensor performance, we test our word classifiers, giving a measurable accuracy for the combined sensor and classification system

  • This paper builds on our previous work by exploring multiple machine learning approaches, including random forest and k-nearest neighbour classifiers that use handcrafted feature extraction methods (a minimal training-and-scoring sketch follows this list)
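As a hedged sketch of how the random forest and k-nearest neighbour classifiers named above might be trained and scored on feature vectors such as those produced by extract_features: the placeholder data, train/test split and hyperparameters here are our assumptions rather than the paper's experimental setup.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    # X: one handcrafted feature vector per repetition; y: the word label.
    # 15 words x 20 repetitions = 300 samples in the word dataset.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 6))        # placeholder feature vectors
    y = np.repeat(np.arange(15), 20)     # placeholder word labels

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0)

    for clf in (RandomForestClassifier(n_estimators=100, random_state=0),
                KNeighborsClassifier(n_neighbors=5)):
        clf.fit(X_train, y_train)
        acc = accuracy_score(y_test, clf.predict(X_test))
        print(type(clf).__name__, f"word accuracy: {acc:.2f}")

A stratified split keeps the 20 repetitions of each word balanced between the training and test sets, which matters for a dataset of only 300 samples.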


Introduction

According to the WHO, around 5% of the population worldwide have hearing and speech impairments [1]. Silent communication can help people with these conditions by converting silent attempts to speak into speech, and it is especially important for patients who cannot rely on traditional voice signals. For example, it can help individuals who have undergone laryngectomies, and who may require speech training after surgery, to speak clearly and confidently. Beyond assisting individuals with speech impairments, this technology could also be used where reliable and secure sound delivery is required, such as in locations of high noise.
