Abstract

In this study, a voice activity detection (VAD) technique is designed using three features: short-term energy, periodicity, and spectral flatness. With these three features, the desired results are obtained even at low signal-to-noise ratio (SNR) values. In addition, the performance of multi-channel noise reduction algorithms, namely the speech distortion weighted Wiener filter, spatial prediction, and the minimum variance distortion-less response (MVDR) filter, is compared using the proposed VAD. Two different audio signals and three different noise types are used in the experiment. The proposed VAD algorithm detects the noisy-speech regions and the noise-only regions. After this detection, the filter coefficients are calculated for each filter algorithm. The calculated filter coefficients are then multiplied by the frequency components of the signal received from the reference microphone to obtain an enhanced signal. Segmental SNR, an objective measure, and mean opinion score, a subjective measure, are used to evaluate the performance of the filters. The speech distortion weighted Wiener filter is found to give the best noise reduction performance.
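
The three per-frame features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `frame_features`, the lag range (chosen for an assumed 8 kHz sampling rate), the Hann window, and the use of a normalized autocorrelation peak as the periodicity measure are all assumptions made for illustration.

```python
import numpy as np

def frame_features(frame, min_lag=20, max_lag=160, eps=1e-12):
    """Return (energy, periodicity, spectral_flatness) for one frame.

    min_lag and max_lag are hypothetical defaults that bracket pitch
    periods of roughly 50-400 Hz at an assumed 8 kHz sampling rate.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)

    # Short-term energy: mean squared amplitude of the frame.
    energy = float(np.mean(frame ** 2))

    # Periodicity: largest autocorrelation value in the pitch-lag range,
    # normalized by the zero-lag value (close to 1 for periodic frames).
    x = frame - frame.mean()
    ac = np.correlate(x, x, mode="full")[n - 1:]
    periodicity = float(np.max(ac[min_lag:max_lag + 1]) / (ac[0] + eps))

    # Spectral flatness: geometric mean over arithmetic mean of the
    # power spectrum (near 0 for tonal frames, larger for noise-like ones).
    power = np.abs(np.fft.rfft(frame * np.hanning(n))) ** 2 + eps
    flatness = float(np.exp(np.mean(np.log(power))) / np.mean(power))

    return energy, periodicity, flatness
```

A VAD decision would then compare these values against thresholds, for example marking a frame as speech when the energy and periodicity are high and the flatness is low. The thresholds and the combination rule used in the paper are not given in this excerpt.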

Highlights

  • One of the most fundamental problems affecting the quality of speech in the communication industry is noise

  • Experimental results and theoretical analysis show that the maximum signal to noise ratio (SNR), Wiener, and trade-off filters are equivalent to the minimum variance distortion-less response (MVDR) filter up to a scaling factor; since this scaling parameter causes distortion in the speech signal, the MVDR filter is recommended in speech enhancement applications

  • According to the experimental results, the best results are obtained with white noise; detection of the frames with speech works quite well, but some parts of the algorithm need to be improved for detecting the noisy frames
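
As a hedged illustration of the MVDR filter mentioned in the highlights, the sketch below computes the textbook MVDR weights for a single frequency bin, h = Φ_v⁻¹ d / (dᴴ Φ_v⁻¹ d), where Φ_v is the noise covariance matrix and d is the steering (relative transfer) vector. The function name `mvdr_weights` and its argument names are assumptions; the paper's exact formulation is not shown in this excerpt and may differ.

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights for one frequency bin.

    noise_cov: (M, M) Hermitian noise covariance, typically estimated
               from the frames that the VAD marks as noise-only.
    steering:  (M,) steering vector relative to the reference microphone.
    """
    noise_cov = np.asarray(noise_cov, dtype=complex)
    steering = np.asarray(steering, dtype=complex)
    # Solve Phi_v @ z = d rather than forming the inverse explicitly.
    z = np.linalg.solve(noise_cov, steering)
    # Normalize so that h^H d = 1 (the distortion-less constraint).
    return z / (steering.conj() @ z)
```

Because the weights satisfy hᴴd = 1, the desired component passes without scaling. A filter that multiplies these weights by a further gain (the "scaling factor" in the highlight above) no longer satisfies this constraint.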

Summary

INTRODUCTION

One of the most fundamental problems affecting the quality of speech in the communication industry is noise. Itzhak et al. [17] present a modified optimization criterion from which the proposed filters may be derived, and compare their performance to that of conventional multichannel noise reduction filters. They show that the new approach is preferable, in particular when the input signal-to-noise ratio (SNR) is low or the number of sensors is small. The VAD technique in [21] provides an online unified framework to overcome the frequent false rejection of the statistical-model-based likelihood-ratio test (LRT) in noisy environments. This method is based on the observation that the sparseness of speech and background noise causes high false-rejection error rates in statistical LRT-based VAD: the false rejection rate increases as the sparseness increases.

SIGNAL MODEL AND SHORT-TIME FEATURES

This section provides information about microphone signals and short-time features used in the proposed algorithm.

SIGNAL MODEL
MINIMUM VARIANCE DISTORTION-LESS RESPONSE FILTER
PROPOSED ALGORITHM
EXPERIMENTAL RESULTS
CONCLUSION
