Abstract

Combined electric and acoustic stimulation (EAS) has demonstrated better speech recognition than conventional cochlear implants (CIs) and yields satisfactory performance under quiet conditions. However, when noise is involved, both the electric and the acoustic signal may be distorted, resulting in poor recognition performance. To suppress noise effects, speech enhancement (SE) is a necessary unit in EAS devices. Recently, a time-domain SE algorithm based on a fully convolutional neural network (FCN) with a short-time objective intelligibility (STOI)-based objective function (termed FCN(S) for short) has received increasing attention owing to its simple structure and effectiveness in restoring clean speech signals from their noisy counterparts. Given the evidence of the benefits of FCN(S) for normal speech, this study sets out to assess its ability to improve the intelligibility of EAS-simulated speech. Objective evaluations and listening tests were conducted to examine how well FCN(S) improves the intelligibility of normal and vocoded speech in noisy environments. The experimental results show that, compared with the traditional minimum mean-square error (MMSE) SE method and the deep denoising autoencoder (DDAE) SE method, FCN(S) yields larger gains in speech intelligibility for both normal and vocoded speech. This study, the first to evaluate deep-learning SE approaches for EAS, confirms that FCN(S) is an effective SE approach that could be integrated into an EAS processor to benefit users in noisy environments.

Highlights

  • A cochlear implant (CI) is a surgically implanted electronic device that stimulates auditory nerves to provide a sense of sound for people with severe-to-profound sensorineural hearing loss.

  • This study is the first to investigate the effectiveness of deep-learning-based speech enhancement (SE) methods on electric and acoustic stimulation (EAS) simulated speech.

  • We focus on comparisons between the recently developed fully convolutional neural network FCN(S) SE approach, a conventional minimum mean-square error (MMSE) SE approach, and a deep-learning-based deep denoising autoencoder (DDAE) SE approach at two SNRs in engine and street noise environments.


Summary

INTRODUCTION

A cochlear implant (CI) is a surgically implanted electronic device that stimulates auditory nerves to provide a sense of sound for people with severe-to-profound sensorineural hearing loss. Fu et al. [63] proposed the use of a fully convolutional neural network (FCN) model for SE in the time domain, which can preserve the neighbouring information of a speech waveform to generate high- and low-frequency components. Their experimental results show that, compared with CNN and deep neural network models, the FCN model yields better speech intelligibility in terms of short-time objective intelligibility (STOI) with fewer parameters. Experimental results confirmed that the DDAE-based method outperforms three commonly used single-microphone SE approaches (logMMSE, KLT, and Wiener filter) in terms of intelligibility, evaluated with STOI, and speech recognition, evaluated with listening tests. These results confirmed the potential of applying deep learning models to improve CI devices.
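The key property of a time-domain FCN enhancer is that stacked 1-D convolutions with "same" padding map a raw noisy waveform directly to an enhanced waveform of identical length, with no fully connected layers. The sketch below illustrates only this structural idea in plain NumPy; it is not the authors' implementation, and the layer count, kernel widths, and channel sizes are illustrative assumptions (the weights are random and untrained).

```python
import numpy as np

def conv1d(x, kernels, padding):
    """Naive 1-D convolution. x: (c_in, T); kernels: (c_out, c_in, width)."""
    c_out, c_in, w = kernels.shape
    xp = np.pad(x, ((0, 0), (padding, padding)))
    t_out = xp.shape[1] - w + 1
    out = np.zeros((c_out, t_out))
    for o in range(c_out):
        for i in range(c_in):
            for k in range(w):
                out[o] += kernels[o, i, k] * xp[i, k:k + t_out]
    return out

def fcn_enhance(noisy, layers):
    """Pass a raw waveform through stacked 1-D conv layers.
    'Same' padding (width // 2) keeps every layer's output the same
    length as its input, so the network maps noisy samples directly
    to enhanced samples in the time domain."""
    h = noisy[np.newaxis, :]                 # (1, T): single-channel waveform
    for kernels in layers[:-1]:
        h = np.maximum(conv1d(h, kernels, kernels.shape[2] // 2), 0)  # ReLU
    # Final layer is linear with one output channel: the enhanced waveform.
    h = conv1d(h, layers[-1], layers[-1].shape[2] // 2)
    return h[0]

rng = np.random.default_rng(0)
layers = [
    rng.standard_normal((8, 1, 11)) * 0.1,   # 1 -> 8 channels
    rng.standard_normal((8, 8, 11)) * 0.1,   # 8 -> 8 channels
    rng.standard_normal((1, 8, 11)) * 0.1,   # 8 -> 1 channel (waveform out)
]
noisy = rng.standard_normal(1600)            # 0.1 s of audio at 16 kHz
enhanced = fcn_enhance(noisy, layers)
print(enhanced.shape)                        # same length as the input: (1600,)
```

In training, such a network would be optimized end-to-end; the STOI-based objective of FCN(S) replaces a plain sample-wise loss with one correlated with intelligibility, but the forward mapping keeps this fully convolutional shape.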

VOCODED SPEECH
EXPERIMENTAL SETUP AND RESULTS
Evaluation on Normal Speech
Evaluation on Vocoded Speech
Findings
CONCLUSION
