Abstract

This paper presents the application of Neural Network Bottleneck (BN) features to Language Identification (LID). BN features are generally used for Large Vocabulary Speech Recognition in conjunction with conventional acoustic features, such as MFCC or PLP. We compare the BN features to several common types of acoustic features used in state-of-the-art LID systems. The test set is from the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded radio communication channels. On this type of noisy data, we show that, on average, the BN features provide a 45% relative improvement in the Cavg or Equal Error Rate (EER) metrics across several test duration conditions, with respect to our single best acoustic features.

Index Terms: language identification, noisy speech, robust feature extraction
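As a reading aid for the metrics named above, the following minimal Python sketch shows one common way to estimate an Equal Error Rate from detection scores and to compute a relative improvement between two error rates. It is not the paper's evaluation code: the function names, the threshold sweep, and all scores and error rates in the usage lines are hypothetical and chosen only for illustration. Cavg is not implemented here.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Return (miss rate, false-alarm rate) when accepting scores >= threshold."""
    miss = sum(s < threshold for s in target_scores) / len(target_scores)
    fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return miss, fa


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    The EER is taken at the threshold where the miss and false-alarm
    rates are closest, and reported as the mean of the two rates.
    """
    best_gap, best_eer = None, None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        miss, fa = error_rates(target_scores, nontarget_scores, t)
        gap = abs(miss - fa)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (miss + fa) / 2
    return best_eer


def relative_improvement(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores, for illustration only (not data from the paper).
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(targets, nontargets)

# Hypothetical error rates: a drop from 10.0% to 5.5% is a 45% relative gain.
gain = relative_improvement(10.0, 5.5)
```

Under this definition, a 45% relative improvement means the error rate falls to 55% of the baseline value, not that it drops by 45 absolute percentage points.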

