Improved voice activity detection using static harmonic features

Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura

doi:10.1109/icassp.2010.5495598



Publication Date: Jan 1, 2010

Citations: 14

Affiliation: IBM Research - Tokyo



Abstract

Accurate voice activity detection (VAD) is important for robust automatic speech recognition (ASR) systems. We previously proposed a statistical-model-based VAD that uses long-term temporal information in speech and shows good robustness against noise in an automobile environment. To improve it further, this paper describes a new method that exploits harmonic structure information with statistical models. In our approach, local peaks considered to be harmonic structures are extracted without explicit pitch detection or voiced-unvoiced classification. The proposed method, which combines long-term temporal and static harmonic features, led to considerable improvements under low-SNR conditions in our VAD testing. In addition, the word error rate was reduced by 29.1% in a test that included a full ASR system.
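
The abstract does not say how the local peaks are extracted, so the following is only a minimal sketch of the general idea: picking spectral local peaks in a single frame with no pitch estimate and no voiced-unvoiced decision. It is not the paper's algorithm. The function names, the `floor_ratio` threshold, and the synthetic 200 Hz test frame are all assumptions made for illustration.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitude for bins 0..N/2 (adequate for a short demo frame)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        spectrum.append(abs(acc))
    return spectrum


def local_peak_mask(spectrum, floor_ratio=0.1):
    """Mark bins that exceed both neighbours and a fraction of the frame maximum.

    floor_ratio is a hypothetical threshold, not a value from the paper.
    """
    floor = floor_ratio * max(spectrum)
    mask = [0] * len(spectrum)
    for k in range(1, len(spectrum) - 1):
        is_peak = spectrum[k] > spectrum[k - 1] and spectrum[k] > spectrum[k + 1]
        if is_peak and spectrum[k] >= floor:
            mask[k] = 1
    return mask


# Synthetic "voiced" frame (assumed test signal): three harmonics of 200 Hz
# sampled at 8 kHz. A 160-sample frame gives DFT bins spaced 50 Hz apart.
SAMPLE_RATE = 8000
FRAME_LEN = 160
frame = [
    sum(math.sin(2 * math.pi * 200 * h * t / SAMPLE_RATE) / h for h in (1, 2, 3))
    for t in range(FRAME_LEN)
]

mask = local_peak_mask(magnitude_spectrum(frame))
peak_bins = [k for k, m in enumerate(mask) if m]
print(peak_bins)  # [4, 8, 12], i.e. 200, 400 and 600 Hz
```

The binary mask is one plausible form of a per-frame ("static") harmonic feature: it records where peaks occur without ever estimating the fundamental frequency. A real system would presumably operate on noisy speech and feed such features to a statistical model alongside the long-term temporal features, which this sketch does not attempt.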
