Abstract

Document image analysis plays a vital role in the digital era. Recent developments in the IT industry have led to the growth of digital data in fields such as medicine, government, education, banking, social media, and digital libraries. Advances in technology have paved the way for converting traditional offices into paperless ones. The growth of digital libraries, e-governance, and internet-based applications has further increased the volume of digital data. Document images now combine text, graphs, images, audio, and video as components, resulting in complex document images that are archived and transmitted on a regular basis. Because of this exponential growth, data is typically stored in compressed form. This paper proposes processing document images directly in their compressed representation, focusing in particular on how content matching and structural analysis can be performed there, and gives insight into the importance of compressed-domain processing. There is a real need for further research on working directly with the compressed representation of document images as a remedy for the ever-increasing challenges of big data. The paper also discusses the various applications of document images and the challenges researchers face in addressing them. Finally, it gives an overview of the state-of-the-art datasets available in the literature on document image analysis.

