This paper demonstrates the encoding of noisy and enhanced speech data. To encode and enhance speech data recorded in an uncontrolled environment, linear predictive coding (LPC) and spectral subtraction with voice activity detection (SS-VAD) are first studied individually. The noisy speech data is obtained by combining a clean speech signal with a noise model, and it is encoded using LPC. LPC is a lossy compression technique that reduces the data rate from 64 kbps to 2.4 kbps. Because of the reverberation and degradation present in noisy speech data, the quality of the encoded noisy speech is poor. Therefore, an algorithm is proposed that enhances and encodes speech data under degraded conditions by combining SS-VAD and LPC. In the first step, the noisy speech data is encoded using LPC and the performance is evaluated using the signal-to-noise ratio. In the second step, the noisy speech data is given as input to the SS-VAD algorithm, and the output of SS-VAD is given as input to the LPC encoder. In the LPC encoder, coefficients are extracted from the input speech data to design all-pole filters. At the analysis step, cross-correlation is also used to differentiate voiced from unvoiced samples. The pitch information and the extracted coefficients are used in the synthesis step. Experiments are conducted on noisy speech data degraded by musical noise, F16 noise, factory noise, and car noise. The experimental results show a significant improvement in the quality of the enhanced encoded speech obtained by the proposed method compared with the encoded noisy speech. Schematic representations of the output waveforms of LPC alone and of the proposed combined SS-VAD and LPC method are also given.
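As a rough illustration of the LPC analysis step described above, the following Python sketch estimates all-pole coefficients for a single frame and computes a signal-to-noise ratio in dB. The abstract does not state the estimation method, predictor order, or framing, so the autocorrelation method with the Levinson-Durbin recursion, the default order of 10, the function names `lpc_coefficients` and `snr_db`, and the use of NumPy are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def lpc_coefficients(frame, order=10):
    """Estimate all-pole (LPC) coefficients with the autocorrelation method.

    Returns (a, err) where a = [1, a1, ..., ap] defines the prediction-error
    filter e[n] = sum_k a[k] * x[n - k], and err is the residual energy.
    The order of 10 is an assumed default, not taken from the paper.
    """
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Autocorrelation of the frame at lags 0..order.
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        # Reflection coefficient for stage i of the Levinson-Durbin recursion.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB of an estimate against a reference signal."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))
```

In a pipeline like the one outlined, `lpc_coefficients` would be applied frame by frame, either to the noisy speech directly or to the SS-VAD output, and `snr_db` would compare each reconstructed signal against the clean reference.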