Abstract

Researchers in speech processing have focused on minimizing the impact of environmental noise, which degrades the performance of systems such as speaker recognition and speech recognition. The speech enhancement approach proposed here for denoising speech signals corrupted by various types of noise is based on the Bionic Wavelet Transform (BWT) and mean square error. A thresholding method denoises the signal based on its spectral amplitude. The inverse bionic wavelet transform is then applied to the denoised coefficients to obtain the enhanced speech signal. Speech quality and speech intelligibility measures are used to assess the performance of the proposed technique, which is compared with a Continuous Wavelet Transform (CWT) approach. Mel-frequency cepstral coefficient (MFCC) features are extracted from the denoised signal for speaker recognition. Machine learning classifiers, namely K-nearest neighbors (KNN), Support Vector Machine (SVM), and Convolutional Neural Network (CNN), are used to recognize the speaker. Six different speakers were recognized more effectively by the CNN than by SVM and KNN. With a database of 2500 samples, the CNN achieves a training accuracy of 98% and a test accuracy of 82% on the enhanced signal.
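
The transform, threshold, and inverse-transform steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: a one-level Haar wavelet stands in for the BWT, which is not available in common libraries, and the soft-threshold rule, the threshold value, the synthetic test signal, and all function names are assumptions made for illustration only.

```python
import math
import random

def haar_forward(x):
    """One-level orthonormal Haar transform (a stand-in for the BWT).
    Expects an even-length sequence; returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def mse(a, b):
    """Mean square error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def denoise(noisy, t):
    """Transform, threshold the detail coefficients, then invert."""
    approx, detail = haar_forward(noisy)
    return haar_inverse(approx, soft_threshold(detail, t))

# Hypothetical demo: a slow sine wave corrupted by Gaussian noise.
random.seed(0)
sigma = 0.2
clean = [math.sin(2 * math.pi * 2 * n / 256) for n in range(256)]
noisy = [c + random.gauss(0.0, sigma) for c in clean]
enhanced = denoise(noisy, t=3 * sigma)  # threshold choice is an assumption

print("MSE before denoising:", mse(clean, noisy))
print("MSE after denoising: ", mse(clean, enhanced))
```

In this sketch the mean square error serves only as an after-the-fact quality check against a known clean signal. The abstract does not say how the paper uses mean square error inside the enhancement method, so that role is left open here.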
