Abstract

The Bag-of-Visual-Words (BoVW) approach has attracted some attention in the field of keyword spotting. However, it discards the spatial relations among visual words. In this study, a visual language model is therefore integrated into the BoVW framework to add spatial information. To carry out keyword spotting, two well-known retrieval schemes, the query likelihood model and KL divergence, are adopted. Experimental results on a collection of historical Mongolian document images show that the visual language model significantly improves keyword spotting performance over the original BoVW approach. The influence of codebook size on performance is also analyzed, and the most appropriate codebook size is determined.
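
The two retrieval schemes named above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the smoothing method, the background model, or the form of the visual language model, so the Jelinek-Mercer mixing weight `lam`, the uniform `background`, the unigram tokens, and all function names and toy data below are assumptions made for illustration. Each word image is treated as a sequence of visual-word ids; the same scoring could in principle be applied to tuples of neighbouring visual words if spatial context is wanted.

```python
import math
from collections import Counter


def smoothed_model(words, background, lam=0.5):
    """Estimate P(w | image) over the codebook.

    Assumption: Jelinek-Mercer smoothing with a background distribution,
    so every visual word keeps a non-zero probability.
    """
    counts = Counter(words)
    total = len(words)
    return {
        w: (1 - lam) * (counts[w] / total if total else 0.0) + lam * p_bg
        for w, p_bg in background.items()
    }


def query_log_likelihood(query_words, doc_model):
    """Query likelihood score: log P(query | document model)."""
    return sum(math.log(doc_model[w]) for w in query_words)


def negative_kl(query_model, doc_model):
    """Negative KL divergence -KL(query || document); higher means closer."""
    return -sum(
        p * math.log(p / doc_model[w])
        for w, p in query_model.items()
        if p > 0
    )


# Toy data (hypothetical): visual-word ids from a codebook of size 4.
codebook_size = 4
background = {w: 1.0 / codebook_size for w in range(codebook_size)}
query = [0, 1, 1, 2]
candidates = {"similar": [0, 1, 1, 2, 2], "different": [3, 3, 3, 0]}

query_model = smoothed_model(query, background)
doc_models = {
    name: smoothed_model(words, background)
    for name, words in candidates.items()
}

# Rank candidate word images under each scheme, best match first.
ranking_ql = sorted(
    doc_models,
    key=lambda name: query_log_likelihood(query, doc_models[name]),
    reverse=True,
)
ranking_kl = sorted(
    doc_models,
    key=lambda name: negative_kl(query_model, doc_models[name]),
    reverse=True,
)
print(ranking_ql, ranking_kl)
```

In this sketch the codebook size only sets the number of entries in `background`; in practice it would be the number of clusters used to quantize local descriptors, which is the parameter the study varies.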
