Abstract

In building large vocabulary speech recognition systems, acoustic model topologies are usually selected empirically or heuristically. In this paper, we propose two improved algorithms, based on the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) respectively, that extend our previously proposed algorithms for selecting and optimizing model topologies in small or medium vocabulary speech recognition systems. The improved algorithms optimize acoustic model topologies for large vocabulary speech recognition systems mainly by modifying the encoding schemes of the earlier algorithms. Experiments on the dialogue corpus of Inner Mongolia University show that, compared with the conventional acoustic model topology selection method, the proposed algorithms yield much higher recognition performance for large vocabulary speech recognition systems by optimizing their acoustic model topologies.

Highlights

  • In constructing Hidden Markov Model (HMM) based speech recognition systems, we need to determine the number of states and the number of Gaussian kernels per state before estimating model parameters, i.e., to select the topologies of the acoustic models.

  • To overcome the disadvantages of the conventional acoustic model topology selection method, we proposed in our earlier work [1,2,3,4] two acoustic model topology optimization algorithms, named Genetic Algorithm (GA)-AMTO and Particle Swarm Optimization (PSO)-AMTO respectively, for small or medium vocabulary speech recognition systems.

  • In this paper, we propose two improved algorithms, the improved GA-AMTO algorithm and the improved PSO-AMTO algorithm, which optimize acoustic model topologies for large vocabulary speech recognition systems by building on our previous work for small or medium vocabulary speech recognition systems.

Summary

Introduction

In constructing Hidden Markov Model (HMM) based speech recognition systems, we need to determine the number of states and the number of Gaussian kernels per state before estimating model parameters, i.e., to select the topologies of the acoustic models. The rest of the paper is organized as follows: Section 2 briefly introduces our previous work on acoustic model topology optimization; Section 3 describes the newly proposed improved algorithms; Section 4 presents the experimental setup and results; Section 5 gives our concluding remarks and outlines future work; Section 6 contains the acknowledgements.
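To make the notion of a model topology concrete, the following minimal Python sketch represents each acoustic model by its number of states and the number of Gaussian kernels per state, and flattens a set of such topologies into a fixed-length integer vector of the kind a GA or PSO search could operate on. This is an illustration only: the names `Topology`, `encode`, `decode`, and `max_states`, and the zero-padding layout, are assumptions made here and are not the encoding scheme used by the GA-AMTO or PSO-AMTO algorithms.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Topology:
    """Hypothetical container for the topology of one acoustic model."""
    # One entry per emitting state: the number of Gaussian kernels in that state.
    mixtures_per_state: Tuple[int, ...]

    @property
    def num_states(self) -> int:
        return len(self.mixtures_per_state)


def encode(topologies: Dict[str, Topology], max_states: int) -> List[int]:
    """Flatten topologies into one integer vector (assumed layout).

    Each model occupies max_states slots, in sorted order of model name;
    unused state slots are padded with 0.
    """
    genes: List[int] = []
    for name in sorted(topologies):
        mixtures = list(topologies[name].mixtures_per_state)
        if len(mixtures) > max_states:
            raise ValueError(f"model {name!r} has more than {max_states} states")
        genes.extend(mixtures + [0] * (max_states - len(mixtures)))
    return genes


def decode(genes: List[int], names: List[str], max_states: int) -> Dict[str, Topology]:
    """Rebuild the per-model topologies from the flat vector."""
    result: Dict[str, Topology] = {}
    for i, name in enumerate(sorted(names)):
        chunk = genes[i * max_states:(i + 1) * max_states]
        # Slots holding 0 are padding and do not correspond to a state.
        result[name] = Topology(tuple(g for g in chunk if g > 0))
    return result


# Two hypothetical acoustic units with different topologies.
example = {"a": Topology((2, 4, 2)), "b": Topology((1, 1, 1, 1, 1))}
vector = encode(example, max_states=5)
restored = decode(vector, list(example), max_states=5)
```

A search procedure would score each candidate vector by training and evaluating the corresponding models; that step depends on the recognition toolkit and is not shown here.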

Our previous work
Improved GA-AMTO and PSO-AMTO algorithms
Experimental setup and results
Summary and future work