Abstract

Deep neural networks have been applied efficiently to speech enhancement. However, given the large variation in speech patterns and noisy environments, a single neural network with a fixed number of hidden layers suffers from strong interference, which can lead to a slow learning process, poor generalisation to unknown signal-to-noise ratios in new inputs, and residual noise in the enhanced output. In this paper, we present a new approach for the hearing impaired that combines two stages: (1) a set of bandpass filters that splits the signal into eight separate bands, each performing a frequency analysis of the speech signal; and (2) multiple deep denoising autoencoder networks, each handling a small, specific enhancement task and learning from a subset of the whole training set. To evaluate the performance of the approach, the hearing-aid speech perception index (HASPI), the hearing-aid sound quality index (HASQI), and the perceptual evaluation of speech quality (PESQ) were used. Improvements in speech quality and intelligibility were evaluated using the audiograms of seven subjects with sensorineural hearing loss. We compared the proposed approach with individual denoising autoencoder networks with three and five hidden layers. The experimental results showed that the proposed approach yielded higher speech quality and intelligibility than the three- and five-layer baselines.
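The two-stage pipeline described above can be illustrated with a short sketch. The following is a minimal, illustrative Python implementation, assuming SciPy for the eight-band filter bank and PyTorch for the per-band denoising autoencoders; the band edges, layer sizes, sampling rate, and names are assumptions for illustration, not the exact configuration used in the paper.

```python
# Minimal sketch of the two-stage approach: an eight-band bandpass filter bank
# followed by one small denoising autoencoder per band.
# All hyperparameters below are illustrative assumptions.
import numpy as np
from scipy.signal import butter, sosfiltfilt
import torch
import torch.nn as nn

FS = 16_000  # assumed sampling rate (Hz)

def make_filter_bank(n_bands=8, f_low=100.0, f_high=7_000.0, order=4, fs=FS):
    """Stage 1: build n_bands bandpass filters (log-spaced edges are an
    assumption; the paper models the analysis of the healthy cochlea)."""
    edges = np.geomspace(f_low, f_high, n_bands + 1)
    return [butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
            for lo, hi in zip(edges[:-1], edges[1:])]

def split_into_bands(signal, filter_bank):
    """Apply each bandpass filter; zero-phase filtering keeps bands aligned."""
    return [sosfiltfilt(sos, signal) for sos in filter_bank]

class DDAE(nn.Module):
    """Stage 2: a small denoising autoencoder for one band (three hidden
    layers here, mirroring one of the baselines compared in the paper)."""
    def __init__(self, n_features=257, n_hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_features),
        )

    def forward(self, noisy_features):
        # Map noisy spectral features of this band to enhanced features.
        return self.net(noisy_features)

# Usage sketch: split the noisy signal into bands, then train/apply one DDAE
# per band; the enhanced bands would be recombined afterwards.
bank = make_filter_bank()
ddaes = [DDAE() for _ in bank]
```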

Highlights

  • Speech is a fundamental means of human communication

  • We propose a new method for amplification in hearing aids that combines two stages: (1) a set of bandpass filters, each performing a frequency analysis of the speech signal modelled on the healthy human cochlea, and (2) multiple deep denoising autoencoder (DDAE) networks, each of which works on a specific enhancement task and learns to handle a subset of the whole training set

  • We investigated the performance of the deep denoising autoencoder for speech enhancement


Introduction

Speech is a fundamental means of human communication. In most noisy situations, the speech signal is mixed with other signals transmitting energy at the same time, which may be noise or even competing speech. In hearing aids (HA), speech enhancement (SE) algorithms are used to clean the noisy signal before amplification by reducing the background noise, because hearing-impaired users experience extreme difficulty communicating in environments with varying levels and types of noise, caused by the loss of temporal and spectral resolution in the auditory system of the impaired ear [1]. In many scenarios, however, reducing the background noise introduces speech distortion, which in turn reduces speech intelligibility in noisy environments [2]. Intelligibility is an objective measure, as it gives the percentage of words that listeners can correctly identify. Based on these two criteria, quality and intelligibility, the considerable challenge in designing an effective SE algorithm for hearing aids is to boost the overall speech quality and increase intelligibility by suppressing noise without introducing any perceptible distortion in the signal.

