Abstract

The main purpose of this study is to develop a deep-learning text-to-speech (TTS) algorithm for an embedded device. First, a critical review of the literature on state-of-the-art deep speech synthesis models is provided. The implementation covers both hardware and algorithmic solutions and targets the Raspberry Pi 4 board. Using the developed TTS algorithm, 80 sentences drawn from medical and everyday language were synthesized. For testing, an application was built containing a questionnaire that allows respondents to evaluate the quality and naturalness of the synthesized speech for both types of language. Efficiency tests of the algorithm follow. The tests performed are presented along with the results obtained from 30 respondents. The discussion comprises a statistical analysis of these results and a comparison with other speech recognition solutions used as a reference. Finally, the summary section gives an overall conclusion on this approach and outlines promising directions for future development. [This work is supported by the Polish National Center for Research and Development (NCBR) project: “ADMEDVOICE-Adaptive intelligent speech processing system of medical personnel with the structuring of test results and support of therapeutic process,” no. INFOSTRATEG4/0003/2022.]
