Document image retrieval based on multi-density features

Zhilan Hu, Xinggang Lin, Hong Yan

doi:10.1007/s11460-007-0032-9

Abstract

The growth of document image databases poses an increasing challenge for document image retrieval techniques. Traditional layout-reconstruction-based methods rely on high-quality document images and on the precision of optical character recognition (OCR), and they can handle only a few widely used languages. The complexity of document layouts also greatly hinders layout-analysis-based approaches. This paper describes a multi-density-feature-based algorithm for binary document images that is independent of both OCR and layout analysis. The text area is extracted after preprocessing steps such as skew correction and marginal noise removal. The aspect ratio and multi-density features are then extracted from the text area and used to select the best candidates from the document image database. Experimental results show that the approach is simple, achieves loss rates below 3%, and efficiently handles images of different resolutions and from different input systems. The system is also robust to disturbances such as noise, notes, and complex layouts.
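The abstract does not define the multi-density features, so the following is only a minimal sketch of the general idea under stated assumptions: the features are taken to be ink (black-pixel) densities measured over grids of several granularities inside the text area, candidates are first filtered by aspect ratio, and the remaining ones are ranked by L1 distance. The text-area extraction here is a plain bounding box standing in for the paper's preprocessing (skew correction and marginal noise removal are not implemented). All names (`text_bounding_box`, `grid_densities`, `describe`, `retrieve`) and all parameters (`grids`, `aspect_tol`, `top_k`) are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed representation: a binary image as rows of ints, 1 = ink, 0 = background.
Image = Sequence[Sequence[int]]


def text_bounding_box(img: Image) -> Tuple[int, int, int, int]:
    """Return (top, bottom, left, right) around all ink; bottom/right are exclusive.

    Stand-in for real text-area extraction (no skew or noise handling).
    """
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        raise ValueError("image contains no ink pixels")
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1


def grid_densities(img: Image, box: Tuple[int, int, int, int], n: int) -> List[float]:
    """Split the box into an n x n grid and return the ink density of each cell."""
    top, bottom, left, right = box
    h, w = bottom - top, right - left
    feats = []
    for i in range(n):
        r0, r1 = top + i * h // n, top + (i + 1) * h // n
        for j in range(n):
            c0, c1 = left + j * w // n, left + (j + 1) * w // n
            area = (r1 - r0) * (c1 - c0)
            ink = sum(img[r][c] for r in range(r0, r1) for c in range(c0, c1))
            feats.append(ink / area if area else 0.0)  # empty cell -> density 0
    return feats


def describe(img: Image, grids: Sequence[int] = (1, 2, 4)) -> Tuple[float, List[float]]:
    """Return (aspect ratio of the text area, densities at every grid size)."""
    box = text_bounding_box(img)
    top, bottom, left, right = box
    aspect = (right - left) / (bottom - top)
    densities: List[float] = []
    for n in grids:
        densities.extend(grid_densities(img, box, n))
    return aspect, densities


def retrieve(query: Image, database: Dict[str, Image],
             aspect_tol: float = 0.1, top_k: int = 5) -> List[str]:
    """Filter candidates by aspect ratio, then rank them by L1 feature distance."""
    q_aspect, q_feats = describe(query)
    scored = []
    for name, img in database.items():
        aspect, feats = describe(img)
        if abs(aspect - q_aspect) > aspect_tol * q_aspect:
            continue  # text-area shape differs too much; skip the candidate
        dist = sum(abs(x - y) for x, y in zip(q_feats, feats))
        scored.append((dist, name))
    return [name for _, name in sorted(scored)[:top_k]]
```

Because each feature is a ratio of ink pixels to cell area, a scan of the same page at a different resolution should yield similar values under these assumptions, which is one plausible reading of the resolution independence the abstract reports. Calling `retrieve(query_image, {"doc1": image1, "doc2": image2})` returns the names of the closest stored pages, best match first.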

Journal: Frontiers of Electrical and Electronic Engineering in China
Publication Date: Apr 1, 2007
Citations: 1

