Abstract

This paper proposes a visual embedding approach to capturing the semantic relatedness between visual words. Specifically, visual words are extracted from a word image collection under the Bag-of-Visual-Words framework, and a deep learning procedure then maps the visual words into embedding vectors in a semantic space. To integrate spatial constraints into the representation of word images, each word image is segmented into several equally sized sub-regions along its rows and columns. Each sub-region is then represented by an average embedding vector, namely the centroid of the embedding vectors of all visual words falling within that sub-region. In this way, a word image is converted into a fixed-length vector by concatenating the average embedding vectors of all of its sub-regions, and the Euclidean distance between these vectors measures the similarity between word images. Experimental results demonstrate that the proposed representation outperforms Bag-of-Visual-Words, the visual language model, spatial pyramid matching, latent Dirichlet allocation, average visual word embeddings, and a recurrent neural network.
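The following is a minimal sketch of the representation step described above, not the authors' implementation. It assumes each detected visual word comes with an id and an (x, y) position inside the word image, and that an `embeddings` lookup (learned separately) maps visual-word ids to vectors; the grid size and embedding dimension are illustrative.

```python
import numpy as np

def word_image_vector(visual_words, embeddings, img_w, img_h, rows=2, cols=4, dim=100):
    """Concatenate per-sub-region centroids of visual-word embeddings.

    visual_words: list of (word_id, x, y) tuples detected in the word image.
    embeddings:   dict mapping word_id -> np.ndarray of shape (dim,).
    """
    cells = [[] for _ in range(rows * cols)]
    for word_id, x, y in visual_words:
        r = min(int(y / img_h * rows), rows - 1)   # row index of the sub-region
        c = min(int(x / img_w * cols), cols - 1)   # column index of the sub-region
        cells[r * cols + c].append(embeddings[word_id])

    parts = []
    for cell in cells:
        if cell:   # centroid of the embeddings falling in this sub-region
            parts.append(np.mean(cell, axis=0))
        else:      # empty sub-region -> zero vector keeps the length fixed
            parts.append(np.zeros(dim))
    return np.concatenate(parts)                   # length: rows * cols * dim

def word_image_distance(vec_a, vec_b):
    """Euclidean distance between two word-image vectors."""
    return np.linalg.norm(vec_a - vec_b)
```

A smaller distance between two concatenated vectors indicates more similar word images under this representation.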
