Abstract

A Silent Speech Interface (SSI) is a technology that aims to synthesize speech from articulatory motion. A Deep Neural Network based SSI is proposed that uses ultrasound images of the tongue as input signals and the spectral coefficients of a vocoder as target parameters. Several deep learning models are presented and discussed, including a baseline feed-forward network and combinations of Convolutional and Recurrent Neural Networks. A pre-processing step using a Deep Convolutional AutoEncoder was also studied. According to the experimental results, an architecture based on CNN and bidirectional LSTM layers achieved the best objective and subjective results.
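The best-performing architecture described above (a CNN front-end feeding bidirectional LSTM layers that regress vocoder spectral coefficients per frame) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the image resolution, channel counts, LSTM width, and number of output coefficients are all illustrative assumptions, since the abstract does not specify them.

```python
import torch
import torch.nn as nn

class CnnBiLstmSSI(nn.Module):
    """Sketch: map a sequence of ultrasound tongue images to vocoder
    spectral coefficients. All layer sizes are illustrative assumptions."""

    def __init__(self, n_coeffs: int = 25):
        super().__init__()
        # CNN front-end: extracts spatial features from each ultrasound frame
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # Bidirectional LSTM: models temporal context across the frame sequence
        self.lstm = nn.LSTM(32 * 4 * 4, 128, batch_first=True,
                            bidirectional=True)
        # Linear head: predicts vocoder spectral coefficients for each frame
        self.head = nn.Linear(2 * 128, n_coeffs)

    def forward(self, x):
        # x: (batch, time, 1, H, W) -- a clip of single-channel ultrasound frames
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1))        # (b*t, 32, 4, 4)
        feats = feats.flatten(1).view(b, t, -1)  # (b, t, 512)
        out, _ = self.lstm(feats)                # (b, t, 256)
        return self.head(out)                    # (b, t, n_coeffs)

model = CnnBiLstmSSI()
frames = torch.randn(2, 10, 1, 64, 64)  # dummy batch: 2 clips of 10 frames
coeffs = model(frames)
print(coeffs.shape)  # torch.Size([2, 10, 25])
```

Running the CNN over the flattened (batch, time) axis and only then applying the recurrent layers is a common way to combine per-frame spatial feature extraction with sequence modeling; in a real SSI pipeline the predicted coefficients would then drive the vocoder to reconstruct the speech waveform.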
