Feature string-based intelligent information retrieval from Tamil document images

S Abirami, D Manjula

doi:10.1504/ijcat.2009.026592

Abstract

As its popularity rises, Information Retrieval (IR) in document images has become a growing and challenging problem. This paper proposes a simple and effective method to extract text and perform intelligent IR from Tamil document images without Optical Character Recognition (OCR). The method generates a feature string for every word image by extracting its features. Instead of recognising letters as OCR does, it relies on the basic characteristics or shapes of the letters. The strength of the technique lies in extracting text from basic features, such as lines and the black and white disposition rates in characters, which remain almost the same for a character across font sizes and font faces. In an offline process, document images are preprocessed, and a text extraction step derives shape-based features from the word images and stores them in temporary files. During online retrieval, a textual keyword is obtained from the user and its primitive string is framed. IR is then performed based on this primitive string, and the resulting images are returned to the user. The technique could be easily adopted for IR in large digital libraries.
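
The offline/online pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exact features or encoding, so the two features used here (a quantised black-pixel rate and a full-height vertical line test), the two-symbol codes, the toy 3x3 glyphs, the Latin placeholder letters standing in for Tamil characters, and every function name are assumptions made only for illustration.

```python
from typing import Dict, List, Sequence

Glyph = Sequence[str]  # rows of '#' (black) and '.' (white); toy stand-in for a character image


def black_rate(glyph: Glyph) -> float:
    """Fraction of black pixels in the glyph (a simple disposition rate)."""
    total = sum(len(row) for row in glyph)
    black = sum(row.count("#") for row in glyph)
    return black / total if total else 0.0


def has_vertical_line(glyph: Glyph) -> bool:
    """True if at least one column is black in every row."""
    if not glyph:
        return False
    width = min(len(row) for row in glyph)
    return any(all(row[c] == "#" for row in glyph) for c in range(width))


def primitive_code(glyph: Glyph) -> str:
    """Encode one character shape as two symbols (hypothetical scheme)."""
    rate = black_rate(glyph)
    level = "L" if rate < 0.34 else "M" if rate < 0.67 else "H"
    line = "V" if has_vertical_line(glyph) else "-"
    return level + line


def feature_string(word: Sequence[Glyph]) -> str:
    """Concatenate the codes of the characters in one word image."""
    return "".join(primitive_code(g) for g in word)


def build_index(pages: Dict[str, List[Sequence[Glyph]]]) -> Dict[str, List[str]]:
    """Offline step: map each feature string to the images that contain it."""
    index: Dict[str, List[str]] = {}
    for image_id, words in pages.items():
        for word in words:
            ids = index.setdefault(feature_string(word), [])
            if image_id not in ids:
                ids.append(image_id)
    return index


def search(keyword: str, code_table: Dict[str, str],
           index: Dict[str, List[str]]) -> List[str]:
    """Online step: frame the keyword's primitive string and look it up."""
    query = "".join(code_table[ch] for ch in keyword)
    return index.get(query, [])


# Toy glyphs (assumed data, not real Tamil character shapes).
BAR = ["#..", "#..", "#.."]
BOX = ["###", "#.#", "###"]
BAND = ["###", "###", "..."]
BAR_BIG = ["##...."] * 6  # the BAR shape at twice the size

# Hypothetical keyword table: placeholder letters -> expected shape codes.
CODE_TABLE = {"a": "LV", "b": "HV", "c": "M-"}

PAGES = {
    "img1": [[BAR, BOX], [BAND]],
    "img2": [[BAR, BOX, BAND]],
}
INDEX = build_index(PAGES)

print(search("ab", CODE_TABLE, INDEX))
print(search("abc", CODE_TABLE, INDEX))
```

In this sketch, `BAR` and `BAR_BIG` receive the same code because both features are ratios or shape tests rather than absolute pixel counts, which mirrors the abstract's claim that the features stay almost the same across font sizes. Exact dictionary lookup is the simplest possible matching rule; a real system would probably need tolerant matching to cope with noise in scanned images.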

Journal: International Journal of Computer Applications in Technology
Publication Date: Jan 1, 2009
Citations: 16
