Abstract

Automatic speech recognition (ASR) is an emerging technology that has been used to recognize the non-typical speech of people with speech impairments and to streamline language sample transcription in communication sciences and disorders. However, the feasibility of using ASR to recognize speech samples generated by high-tech Augmentative and Alternative Communication (AAC) systems has not been investigated. This proof-of-concept paper investigates the feasibility of using an AAC-ASR model to transcribe language samples generated by high-tech AAC systems and compares its recognition accuracy with that of two published ASR models, CMU Sphinx and Google Speech-to-Text. An AAC-ASR model was developed to transcribe simulated AAC speaker language samples, and its word error rate (WER) was compared with those of CMU Sphinx and Google Speech-to-Text. On the test files, the AAC-ASR model achieved a lower WER (28.6%) than CMU Sphinx and Google Speech-to-Text (70.7% and 86.2%, respectively). Our results demonstrate the feasibility of using an ASR model to automatically transcribe high-tech AAC-simulated language samples in support of language sample analysis. Future work will focus on training the model with more diverse AAC speech datasets and on understanding the speech patterns of individual AAC users to refine the AAC-ASR model.

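For readers unfamiliar with the evaluation metric, the sketch below shows how word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and an ASR hypothesis, divided by the number of reference words. This is a generic illustration, not the authors' evaluation code, and the example transcripts are hypothetical.

```python
# Minimal sketch of word error rate (WER), the metric used to compare
# the AAC-ASR model with CMU Sphinx and Google Speech-to-Text.
# WER = (substitutions + deletions + insertions) / number of reference words.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = word-level edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)

# Hypothetical transcripts (not drawn from the study's data):
print(wer("i want more juice please", "i want more shoes please"))  # 0.2
```

In practice a lower WER indicates closer agreement with the human-verified transcript, which is how the 28.6% figure for the AAC-ASR model compares favorably with the 70.7% and 86.2% reported for the baseline systems.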