Abstract

Support vector machines (SVMs) have played an important role in the state-of-the-art language recognition systems. The recently developed extreme learning machine (ELM) tends to have better scalability and achieve similar or much better generalization performance at much faster learning speed than traditional SVM. Inspired by the excellent feature of ELM, in this paper, we propose a novel method called regularized minimum class variance extreme learning machine (RMCVELM) for language recognition. The RMCVELM aims at minimizing empirical risk, structural risk, and the intra-class variance of the training data in the decision space simultaneously. The proposed method, which is computationally inexpensive compared to SVM, suggests a new classifier for language recognition and is evaluated on the 2009 National Institute of Standards and Technology (NIST) language recognition evaluation (LRE). Experimental results show that the proposed RMCVELM obtains much better performance than SVM. In addition, the RMCVELM can also be applied to the popular i-vector space and get comparable results to the existing scoring methods.

Highlights

  • The task of spoken language recognition is to automatically determine the identity of the language spoken in a speech sample [1]

  • We can note that the proposed regularized minimum class variance extreme learning machine (RMCVELM) outperforms support vector machines (SVMs) at all the three different durations

  • RMCVELM can greatly enhance the performance at the 30-s duration, obtaining at most 15 % (EER) and 19 % (Cavg) relative improvement compared with SVM, respectively

Read more

Summary

Introduction

The task of spoken language recognition is to automatically determine the identity of the language spoken in a speech sample [1]. It has received increasing attention due to its importance for the enhancement of automatic speech recognition (ASR) [2] and multi-language translation systems [3, 4]. The most popular language recognition techniques can usually be categorized as phonotactic [5–7] and acoustic approaches [8, 9]. The former method is based upon phone recognizer followed by language models (PRLM), parallel PRLM (PPRLM), and supper vector machines PRLM. The introduction of support vector machines (SVMs) into language recognition resulted in significant improvement in performance [10]. SVMs have proven to be an effective method and have been widely used for many

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call