Abstract

Background noise is acoustically mixed with human speech during communication. Many researchers are currently working on voice/speech activity detection (VAD) in noisy environments. A VAD system separates frames that contain human speech from frames that contain only noise. Background noise identification has a number of applications, such as speech enhancement and crime investigation. With a background noise identification system, one can identify the possible location of a speaker (street, train, airport, restaurant, babble, car, etc.) during communication. This helps security and intelligence personnel respond quickly by identifying the location of a crime. In this paper, a new algorithm based on the G.729 VAD is proposed for selecting an appropriate set of noisy frames. Mel-frequency cepstral coefficients (MFCC) and linear predictive coding (LPC) coefficients are used as feature vectors. These features are computed for the selected frames and passed to the classifier. Seven types of noise are classified using the proposed classifier. Experiments show that MFCC is the more suitable feature vector for noise identification with a random forest classifier. Selecting appropriate noisy frames with the proposed approach increases the accuracy of the random forest and SVM classifiers by up to 5% and 3%, respectively. The performance of the random forest classifier is found to be 11% higher than that of the SVM classifier.
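
The abstract does not spell out the frame-selection algorithm, so the following is only a minimal sketch of the general idea of keeping noise-only frames for later feature extraction. It assumes a simple short-time energy rule as a stand-in for the G.729 VAD decision; it is not the G.729 algorithm and not the authors' method. The function names, the `ratio` threshold, and the synthetic test signal are all hypothetical. The 80-sample frame corresponds to 10 ms at 8 kHz, the frame size used by G.729.

```python
import math
import random

FRAME_LEN = 80  # 10 ms at 8 kHz, the frame size G.729 operates on


def split_frames(signal, frame_len=FRAME_LEN):
    """Split samples into non-overlapping frames, dropping any incomplete tail."""
    count = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def select_noise_frames(signal, frame_len=FRAME_LEN, ratio=4.0):
    """Return indices of frames judged to contain only background noise.

    Assumption (not from the paper): a frame is noise-only when its energy
    is at most `ratio` times the lowest frame energy in the signal.
    """
    frames = split_frames(signal, frame_len)
    if not frames:
        return []
    energies = [frame_energy(f) for f in frames]
    floor = min(energies)  # crude estimate of the noise floor
    return [i for i, e in enumerate(energies) if e <= ratio * floor]


def make_demo_signal(seed=0):
    """Hypothetical test signal: 10 frames of weak noise, with a loud
    200 Hz tone standing in for speech in frames 3 to 5."""
    rng = random.Random(seed)
    signal = [rng.uniform(-0.01, 0.01) for _ in range(10 * FRAME_LEN)]
    for n in range(3 * FRAME_LEN, 6 * FRAME_LEN):
        signal[n] += 0.5 * math.sin(2 * math.pi * 200 * n / 8000)
    return signal


if __name__ == "__main__":
    noise_idx = select_noise_frames(make_demo_signal())
    print("noise-only frame indices:", noise_idx)
```

In a full pipeline, the selected frames would then be turned into MFCC or LPC feature vectors and passed to a classifier such as a random forest or an SVM; those stages are omitted here because the abstract gives no parameters for them.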
