Abstract

Extracting text from images involves text detection, localization, tracking, extraction, enhancement, and recognition. Because of their inherent complexity, traditional algorithms for text localization in natural scenes, especially multi-context scenes, cannot be implemented on architectures with low computational resources such as mobile phones. In this paper, we propose a simple method to automatically localize signboard text within JPEG images taken with mobile phone cameras. Using the information provided by the Discrete Cosine Transform (DCT) used by the JPEG compression format, we delimit the borders of the most important text region. The system is simple, reliable, affordable, easy to implement, and fast, even on architectures with low computational resources.

Highlights

  • Text Information Extraction (TIE) is a well-differentiated branch of the pattern recognition field

  • The processing techniques that TIE systems need to overcome these difficulties are usually extremely computationally expensive, so implementing them on HID, which have very low computational resources, is often infeasible

  • Since computation time is a major issue on these devices, and since most images are stored in a JPEG-compressed format [2], we designed our system to localize text using the DCT (Discrete Cosine Transform) coefficients [3]-[5], avoiding the computationally expensive process of decompressing the image
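
The idea of locating text from DCT coefficients can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that text-like blocks show high AC energy, uses a hypothetical `threshold` parameter, and computes the 8x8 DCT from pixel values as a stand-in, whereas the approach described here would read the coefficients already stored in the JPEG stream. All function names are illustrative.

```python
import math

N = 8  # JPEG operates on 8x8 blocks


def dct2_block(block):
    """Naive 2-D DCT-II of one 8x8 block (list of 8 lists of 8 numbers)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * s
    return out


def ac_energy(coeffs):
    """Sum of absolute AC coefficients (every term except the DC term)."""
    total = sum(abs(coeffs[u][v]) for u in range(N) for v in range(N))
    return total - abs(coeffs[0][0])


def text_bounding_box(image, threshold):
    """Bounding box, in block units, of blocks whose AC energy exceeds
    the (assumed) threshold: (top, left, bottom, right), or None."""
    rows, cols = len(image) // N, len(image[0]) // N
    hits = []
    for br in range(rows):
        for bc in range(cols):
            block = [row[bc * N:(bc + 1) * N]
                     for row in image[br * N:(br + 1) * N]]
            if ac_energy(dct2_block(block)) > threshold:
                hits.append((br, bc))
    if not hits:
        return None
    block_rows = [r for r, _ in hits]
    block_cols = [c for _, c in hits]
    return (min(block_rows), min(block_cols), max(block_rows), max(block_cols))
```

Here `image` is a grayscale image given as a list of pixel rows whose dimensions are multiples of 8. In a real JPEG pipeline the per-block coefficients are available after entropy decoding, so the `dct2_block` step would be skipped entirely, which is where the computational saving comes from.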

Summary

Introduction

Text Information Extraction (TIE) is a well-differentiated branch of the pattern recognition field. TIE was focused on the analysis of scanned documents, which provided a pseudo-ideal scenario: high resolution, minimal character shape distortion, even and adequate illumination, clear, simple, and known backgrounds, minimal blur, and so on. This constrained scenario was insufficient for developing useful applications for general use. These devices are small, light, portable, cheap, easily integrated with networks, and can capture any image or video in any scenario. As a result, they have built a huge market niche, either standalone or embedded in other devices, such as mobile phones.

Background for Text Localization
Proposed Algorithm
Verification of the Methodology
Experimental Test and Results
Text data in different angles under normal conditions
Normal Condition:
Commercial Sign Board Images
Dark Colored Background Images
Multi Text Images
ICDAR 2003 Images Dataset
Findings
Summaries and Conclusions
