Abstract

Building a voice-operated system for learning disabled users is a difficult task that requires considerable time and effort. Because of the wide spectrum of disabilities and their related phonopathies, most available approaches target a specific pathology. This may improve their accuracy for some users, but makes them unsuitable for others. In this paper, we present a cross-lingual approach to adapt a general-purpose modular speech recognizer to learning disabled people. The main advantage of this approach is that it allows rapid and cost-effective development: it reuses an already built speech recognition engine and its modules, and exploits existing resources for standard speech in different languages to recognize the users’ atypical voices. Although the recognizers built with the proposed technique obtain lower accuracy rates than those trained for specific pathologies, they can be used by a wide population and developed more rapidly, which makes it possible to design various types of speech-based applications accessible to learning disabled users.

Highlights

  • Millions of individuals suffer from learning disabilities that affect their speech production

  • Because developing speech recognition technologies from scratch is a very time- and resource-consuming process, we propose to avoid these costs by means of cross-lingual adaptation

  • TP means true positives, FP means false positives, FN means false negatives, Nkw denotes the number of keywords in the vocabulary, Dur stands for the total duration of recordings, and Nrec is the number of words in the reference transcription that really appear in each audio recording
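
As a rough illustration of how the counts defined in the last highlight are commonly combined, the sketch below computes precision, recall, and a false-alarm rate per keyword per hour. These are standard keyword-spotting definitions and are an assumption here: the excerpt does not state which formulas the paper uses, and it does not say how Nrec enters the evaluation, so Nrec is left out. All names (`DetectionCounts`, `precision`, `recall`, `false_alarms_per_keyword_hour`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Keyword detection counts (hypothetical container, not from the paper)."""
    tp: int  # true positives: keywords correctly detected
    fp: int  # false positives: detections with no matching keyword
    fn: int  # false negatives: keywords that were missed


def precision(c: DetectionCounts) -> float:
    """Fraction of detections that were correct: TP / (TP + FP)."""
    detected = c.tp + c.fp
    return c.tp / detected if detected else 0.0


def recall(c: DetectionCounts) -> float:
    """Fraction of actual keyword occurrences that were found: TP / (TP + FN)."""
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def false_alarms_per_keyword_hour(c: DetectionCounts, n_kw: int, dur_hours: float) -> float:
    """Assumed definition: FP / (Nkw * Dur), with Dur expressed in hours."""
    if n_kw <= 0 or dur_hours <= 0:
        raise ValueError("n_kw and dur_hours must be positive")
    return c.fp / (n_kw * dur_hours)


# Illustrative numbers only; they are not results from the paper.
counts = DetectionCounts(tp=8, fp=2, fn=2)
print(precision(counts))
print(recall(counts))
print(false_alarms_per_keyword_hour(counts, n_kw=10, dur_hours=2.0))
```

Normalizing false alarms by both vocabulary size and recording duration keeps the rate comparable across test sets of different sizes, which is why Nkw and Dur usually appear together in such measures.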


Summary

Introduction

Millions of individuals suffer from learning disabilities that affect their speech production. These conditions result in atypical voices that are very difficult to understand even for human listeners, as they may affect one or more of the major language subsystems, including phonology, morphology, syntax and semantics. Focusing on phonology, impaired speech can affect voice timing, pitch, volume, fluency and articulation [1]. Several studies have examined the nature of such mispronunciations and their impact on intelligibility. In [3], the authors focus on how to measure the intelligibility of atypical voices objectively along different perceptual dimensions.

