Gaussian mixture modeling and learning of neighboring characters for multilingual text extraction in images

Xiabi Liu, Hui Fu, Yunde Jia

doi:10.1016/j.patcog.2007.06.004

Abstract

This paper proposes an approach to extracting multilingual text from images, based on the statistical modeling and learning of neighboring characters. The case of three neighboring characters is represented by a Gaussian mixture model and discriminated from other cases by a corresponding ‘pseudo-probability’ defined under the Bayesian framework. Based on this model, text extraction is performed by labeling each connected component in the binary image as character or non-character according to its neighbors. A mathematical-morphology-based method is introduced to detect and connect the separated parts of each character, and a Voronoi-partition-based method is proposed to establish the neighborhoods of connected components. We further present a discriminative training algorithm based on the maximum–minimum similarity (MMS) criterion to estimate the parameters of the proposed approach. Experiments on Chinese and English text extraction demonstrate the effectiveness of the approach trained with the MMS algorithm, which achieved a precision of 93.56% and a recall of 98.55% on the test data set. The experiments also show that MMS significantly improves overall performance compared with two influential training criteria, maximum likelihood (ML) and minimum classification error (MCE).
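To make the modeling idea concrete, here is a minimal sketch of scoring a feature vector with a Gaussian mixture and mapping the resulting density into [0, 1). The abstract does not give the paper's feature definition or its pseudo-probability formula, so everything specific here is an assumption for illustration: the diagonal covariances, the mapping 1 - exp(-scale * density), the function names, the toy parameters, and the 0.5 threshold.

```python
import math

def gaussian_diag_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def gmm_pdf(x, weights, means, variances):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * gaussian_diag_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

def pseudo_probability(density, scale=1.0):
    """Map a non-negative density into [0, 1).

    Assumed form for illustration only; the paper's definition may differ.
    """
    return 1.0 - math.exp(-scale * density)

def is_character_triple(x, gmm, scale=1.0, threshold=0.5):
    """Label a feature vector as 'three neighboring characters' or not."""
    weights, means, variances = gmm
    density = gmm_pdf(x, weights, means, variances)
    return pseudo_probability(density, scale) > threshold

# Hypothetical two-component mixture over made-up 2-D features.
toy_gmm = ([0.6, 0.4],
           [[0.0, 0.0], [3.0, 3.0]],
           [[1.0, 1.0], [0.5, 0.5]])

near = is_character_triple([0.1, -0.2], toy_gmm, scale=20.0)   # close to a component mean
far = is_character_triple([10.0, 10.0], toy_gmm, scale=20.0)   # far from both components
```

In this sketch the scale and threshold are fixed by hand; in the approach described above, such parameters would be estimated by the discriminative MMS training rather than chosen manually.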

Journal: Pattern Recognition
Publication Date: Jun 20, 2007
Citations: 44

Similar Papers

Gaussian Mixture Modeling of Neighbor Characters for Multilingual Text Extraction in Images
Hui Fu ... Xiabi Liu
01 Oct 2006

A comprehensive method for multilingual video text detection, localization, and extraction
M.R Lyu ... Min Cai
IEEE Transactions on Circuits and Systems for Video Technology | VOL. 15
01 Feb 2005

Text extraction from images using gamma correction method and different text extraction methods — A comparative analysis
G Gayathri Devi ... C P Sumathi
01 Feb 2014

Gaussian Mixture Models with Uncertain Parameters
Jia Zeng ... Lei Xie
01 Jan 2007
