Abstract

A deep learning speech enhancement algorithm based on dynamic hybrid feature and adaptive mask and DSP implementation is proposed in this paper, which solves the problem of feature loss and improves the performance of speech enhancement. The dynamic features incorporate the log Mel power spectrum, Mel cepstral coefficients, and Multiresolution Auditory Cepstral Coefficients (MRACC) and capture the speech transient information by deriving the derivatives to comprehensively represent the nonlinear structure of speech and reduce distortion. To make the system improve the speech quality while reducing the speech distortion as much as possible, a soft mask that can be adaptively adjusted considering the signal-to-noise ratio information is proposed, which can be automatically adjusted according to the different speech signal-to-noise ratio information to obtain the mask value under the corresponding signal-to-noise ratio conditions, and phase difference information that can improve the speech intelligibility is incorporated in it. Then, an improved deep neural network model is designed to effectively improve the speech enhancement performance. Finally, the hardware and algorithm software design of the DSP-based speech enhancement system is given. Experimental simulations are carried out for multiple voices in different noise backgrounds. The experimental results indicate that the performance indexes of the proposed method are significantly improved compared with the existing speech enhancement methods, which verifies the feasibility and superiority of the proposed method.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call