Abstract

This study presents a novel approach to spoken document retrieval based on neural probabilistic language modeling for semantic inference. The neural network based language model is applied to estimate word association in a continuous space. The different kinds of weighting schemes are investigated to represent recognized words of a spoken document into an indexing vector. The indexing vector is transferred into the semantic indexing vector through the neural probabilistic language model. Such a semantic word inference and re-weighting make the semantic indexing vector a suitable representation for speech indexing. Experimental results conducted on Mandarin Chinese broadcast news show that the proposed approach can achieve a substantial and consistent improvement of spoken document retrieval.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call