Arabic word descriptor for handwritten word indexing and lexicon reduction

Youssouf Chherawala, Mohamed Cheriet

doi:10.1016/j.patcog.2014.04.025

Abstract

Word recognition systems use a lexicon to guide the recognition process in order to improve the recognition rate. However, as the lexicon grows, the computation time increases. In this paper, we present the Arabic word descriptor (AWD) for Arabic word shape indexing and lexicon reduction in handwritten documents. It is formed in two stages. First, the structural descriptor (SD) is computed for each connected component (CC) of the word image. It describes the CC shape using the bag-of-words model, where each visual word represents a different local shape structure, extracted from the image with filters of different patterns and scales. Then, the AWD is formed by sorting and normalizing the SDs. This emphasizes the symbolic features of Arabic words, such as subwords and diacritics, without performing layout segmentation. In the context of lexicon reduction, the AWD is used to index a reference database. Given a query image, the reduced lexicon is obtained from the labels of the first entries in the indexed database. This framework has been tested on Arabic word databases. It has a low computational overhead, while providing a compact descriptor, with state-of-the-art results for lexicon reduction on the Ibn Sina and IFN/ENIT databases.
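The two steps described in the abstract, forming the AWD from per-component structural descriptors and then reducing the lexicon by querying an indexed database, can be illustrated with a minimal sketch. The abstract does not give the sort key, the normalization, or the distance measure, so everything below is an assumption made for illustration: SDs are taken as precomputed bag-of-words histograms, components are sorted by total histogram mass, the concatenated vector is L2-normalized, and retrieval uses Euclidean distance. The names `build_awd`, `reduce_lexicon`, and `max_ccs` are hypothetical and do not come from the paper.

```python
import math


def build_awd(sds, max_ccs=3):
    """Form a word descriptor from per-component structural descriptors (SDs).

    Assumptions (not from the paper): components are sorted by total
    histogram mass, largest first; only the first `max_ccs` are kept and
    zero-padded if there are fewer; the result is L2-normalized.
    """
    dim = len(sds[0])
    ordered = sorted(sds, key=sum, reverse=True)[:max_ccs]
    ordered += [[0.0] * dim] * (max_ccs - len(ordered))
    flat = [float(v) for sd in ordered for v in sd]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


def reduce_lexicon(query, index, k):
    """Return the distinct labels of the k reference entries nearest to query."""
    ranked = sorted(index, key=lambda entry: math.dist(query, entry[0]))
    labels = []
    for _, label in ranked[:k]:
        if label not in labels:
            labels.append(label)
    return labels


# Toy reference database: (descriptor, word label) pairs with made-up histograms.
index = [
    (build_awd([[4, 0, 1], [0, 2, 0]]), "word_a"),
    (build_awd([[0, 5, 0], [1, 0, 0]]), "word_b"),
    (build_awd([[4, 1, 1], [0, 2, 0]]), "word_a"),
]

# The query lists its components in a different order; sorting makes that irrelevant.
query = build_awd([[0, 2, 0], [4, 0, 2]])
reduced = reduce_lexicon(query, index, k=2)
```

In this toy setup the recognizer would only need to consider the labels in `reduced` rather than the full lexicon. A larger `k` trades a bigger reduced lexicon for a lower risk of discarding the true label.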

Journal: Pattern Recognition
Publication Date: May 14, 2014
Citations: 11

Similar Papers

Extraction of Arabic Words from Complex Color Image
R Fathalla ... Y El Sonbaty
01 Sep 2007

Text line extraction of curved document images using hybrid metric
Zuming Huang ... Jie Gu
01 Nov 2015

Recognition of Handwritten Arabic Literal Amounts Using a Hybrid Approach
Abdelhak Boukharouba ... Abdelhak Bennia
Cognitive Computation | VOL. 3
24 Dec 2010

TWO-STAGE LEXICON REDUCTION FOR OFFLINE ARABIC HANDWRITTEN WORD RECOGNITION
Saeed Mozaffari ... Volker Märgner
International Journal of Pattern Recognition and Artificial Intelligence | VOL. 22
01 Nov 2008
