Abstract

Large-margin estimation (LME) is known to generalize well to unseen test data. In our previous work, LME of hidden Markov models (HMMs) was successfully applied to several small-scale speech recognition tasks using semi-definite programming (SDP). In this paper, we extend that work by exploring a more efficient convex optimization method based on second-order cone programming (SOCP). More specifically, we study and propose several SOCP relaxation techniques that convert LME of HMMs in speech recognition into a standard SOCP problem, so that LME can be solved with more efficient SOCP methods. The formulation is general enough to handle various types of competing hypothesis spaces, such as N-best lists and word graphs. The proposed LME/SOCP approaches are evaluated on two standard speech recognition tasks. Experimental results on the TIDIGITS task show that the SOCP method significantly outperforms the gradient descent method and achieves performance comparable to SDP, while running 20 to 200 times faster and requiring less memory and computing resources. Furthermore, the proposed LME/SOCP method has also been successfully applied to a large-vocabulary task using the Wall Street Journal (WSJ0) database. The WSJ-5k recognition results show that the proposed method yields better performance than conventional approaches, including maximum-likelihood estimation (MLE), maximum mutual information estimation (MMIE), and the more recent boosted MMIE method.
