Abstract

The main goal of speech enhancement is to improve the performance of speech communication systems in noisy environments. Enhancing speech corrupted by noise remains a substantial problem, although researchers have introduced many techniques over the years. The problem is more severe when no additional information on the nature of the noise degradation is available, in which case the enhancement technique must rely only on the specific properties of the speech and noise signals. Signal representation and enhancement in the cosine transform domain has been observed to give good results, and the Discrete Cosine Transform (DCT) has been widely used for speech enhancement. In this study, instead of the DCT alone, a hybrid technique called DCTSLT, which combines the DCT and the Slantlet Transform (SLT), is proposed for continuous energy compaction along with critical sampling and flexible window switching. To deal with frame-to-frame deviations of the cosine transform, the proposed transform is combined with the Time Domain Pitch Synchronous Overlap-Add (TD-PSOLA) method. Moreover, to improve the noise reduction performance of the system, a Hybrid Vector Wiener Filter (HVWF) approach is used. Experimental results show that the proposed system performs well in enhancing speech compared with other techniques.

Highlights

  • In many speech communication systems, recognizing speech in a signal degraded by background noise is difficult at low SNR values

  • The proposed hybrid technique, which combines the Discrete Cosine Transform and the Slantlet Transform with a Hybrid Vector Wiener Filter (HVWF) approach, is evaluated using two objective measures: segmental SNR (SegSNR) and Perceptual Evaluation of Speech Quality (PESQ)

  • SegSNR correlates better with the Mean Opinion Score (MOS) than SNR does (Balaji and Subramanian, 2014) and is simple to implement, so it has been widely used to assess enhanced speech
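
The SegSNR measure named in the highlights can be illustrated with a minimal sketch. It assumes the common definition (the average of per-frame SNR values in dB); the function name `segmental_snr`, the frame length of 256 samples, and the clamping range of -10 to 35 dB are illustrative assumptions, not values taken from the paper.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Average per-frame SNR in dB between a clean and an enhanced signal.

    frame_len, lo_db and hi_db are assumed defaults for illustration only.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    eps = 1e-12  # guards against division by zero and log of zero
    frame_snrs = []
    # Walk over non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in c)
        error_energy = sum((x - y) ** 2 for x, y in zip(c, e))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clamp each frame so silent or near-perfect frames do not dominate.
        frame_snrs.append(min(max(snr_db, lo_db), hi_db))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)
```

Because the average is taken over short frames rather than the whole utterance, low-energy speech segments contribute as much as loud ones, which is one reason the measure is often preferred to a global SNR.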


Summary

Introduction

In many speech communication systems, recognizing speech in a signal degraded by background noise is difficult at low SNR values. Spectral subtraction is the most basic method for enhancing speech corrupted by additive noise (Boll, 1979). This technique estimates the spectrum of the clean speech signal by subtracting the estimated noise magnitude spectrum from the magnitude spectrum of the noisy signal while keeping the phase spectrum of the noisy signal. Its disadvantage is that the enhanced signal still contains residual noise. It is used for enhancing a speech signal corrupted by uncorrelated additive noise or colored noise (Ephraim and Van Trees, 1995).
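
The spectral subtraction step described above can be sketched for a single frame as follows. This is a minimal illustration, not the implementation from the cited work: the names `dft`, `idft`, and `spectral_subtract_frame` are hypothetical, the noise magnitude spectrum is assumed to be supplied by the caller (for example, averaged over speech-free frames), and negative magnitudes are simply floored at zero. A naive DFT is used only to keep the example free of dependencies; a practical system would use an FFT library.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform, O(n^2); for illustration only."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract_frame(noisy_frame, noise_magnitude):
    """Subtract an estimated noise magnitude spectrum from one frame.

    noise_magnitude holds one non-negative value per DFT bin (an assumed input).
    """
    if len(noisy_frame) != len(noise_magnitude):
        raise ValueError("need one noise magnitude per DFT bin")
    cleaned = []
    for bin_value, noise_mag in zip(dft(noisy_frame), noise_magnitude):
        # Subtract in the magnitude domain; floor at zero (half-wave rectification).
        magnitude = max(abs(bin_value) - noise_mag, 0.0)
        # Keep the phase of the noisy signal unchanged.
        phase = cmath.phase(bin_value)
        cleaned.append(cmath.rect(magnitude, phase))
    return idft(cleaned)
```

Flooring at zero is what leaves isolated spectral peaks behind when the noise estimate is imperfect, which corresponds to the residual noise mentioned above.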

