Abstract

Silent Speech Interfaces (SSIs) have been proposed as a means of reconstructing audible speech from silent articulatory gestures, enabling covert voice communication in public and voice assistance for people with aphasia. Prior SSI approaches, which rely on either wearable devices or cameras, suffer from extended contact requirements or privacy leakage risks. Recent advances in acoustic sensing have brought new opportunities for sensing gestures, but these methods were designed to classify speech content rather than reconstruct audible speech, and thus discard important speech information (e.g., speech rate, intonation, and emotion). In this paper, we propose the first system that supports accurate audible speech reconstruction by analyzing the disturbance of tiny articulatory gestures on the reflected ultrasound signal. Our design introduces a new model that establishes a unique mapping between ultrasound and speech signals, so that audible speech can be successfully reconstructed from silent speech. However, learning this mapping requires plenty of training data. Instead of the time-consuming collection of massive amounts of training data, we construct an inverse task that forms a dual with the original task and generates virtual gestures from widely available audio (e.g., phone calls) to facilitate model training. Furthermore, we introduce a fine-tuning mechanism that uses unlabeled data for user adaptation. We implement the system on a portable smartphone and evaluate it in various environments. The evaluation results show that it can reconstruct speech with a Character Error Rate (CER) as low as 7.62%, and that with only 1 hour of ultrasound signals from a new user it decreases the CER from 82.77% to 9.42%, outperforming state-of-the-art acoustic-based approaches while preserving rich speech information.
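
To make the dual-task idea concrete, below is a minimal sketch of how a forward ultrasound-to-speech mapper could be trained jointly with an inverse speech-to-gesture model, so that unpaired audio yields "virtual" gestures via a cycle-consistency loss. This is an illustrative assumption in PyTorch, not the paper's actual architecture: the module names (Seq2SeqMapper), feature dimensions, GRU backbone, and loss weighting are all hypothetical.

```python
import torch
import torch.nn as nn

ULTRA_DIM, SPEECH_DIM, HIDDEN = 64, 80, 128  # assumed feature sizes, not from the paper

class Seq2SeqMapper(nn.Module):
    """Minimal GRU-based sequence-to-sequence mapper (placeholder architecture)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.rnn = nn.GRU(in_dim, HIDDEN, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * HIDDEN, out_dim)

    def forward(self, x):                     # x: (batch, time, in_dim)
        h, _ = self.rnn(x)
        return self.proj(h)                   # (batch, time, out_dim)

forward_model = Seq2SeqMapper(ULTRA_DIM, SPEECH_DIM)   # ultrasound -> speech features
inverse_model = Seq2SeqMapper(SPEECH_DIM, ULTRA_DIM)   # speech -> virtual gesture features

opt = torch.optim.Adam(
    list(forward_model.parameters()) + list(inverse_model.parameters()), lr=1e-4)
mse = nn.MSELoss()

def dual_training_step(ultra, speech, unpaired_speech):
    """One step: supervised loss on paired data, plus a dual-consistency
    loss that turns widely available audio into virtual gesture data."""
    opt.zero_grad()
    # Paired ultrasound/speech data: supervise both directions.
    loss = mse(forward_model(ultra), speech) + mse(inverse_model(speech), ultra)
    # Unpaired audio: synthesize virtual gestures, then require the forward
    # model to recover the original speech from them (cycle consistency).
    virtual_gesture = inverse_model(unpaired_speech)
    loss = loss + mse(forward_model(virtual_gesture), unpaired_speech)
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random tensors standing in for real feature sequences.
ultra = torch.randn(4, 100, ULTRA_DIM)
speech = torch.randn(4, 100, SPEECH_DIM)
unpaired = torch.randn(4, 100, SPEECH_DIM)
print(dual_training_step(ultra, speech, unpaired))
```

Under these assumptions, the inverse model lets abundant audio (e.g., recorded phone calls) stand in for scarce paired ultrasound recordings, which matches the abstract's motivation for avoiding large-scale data collection.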
