Abstract

In this article, we propose a novel stroke width transform (SWT) voting-based color reduction method for detecting text in natural scene images. Unlike other text detection approaches, which mostly rely on either text structure or color, the proposed method combines both by supervising the text-oriented color reduction process with additional SWT information. SWT pixels mapped to color space vote in favor of the color they correspond to. Colors receiving a high SWT vote most likely belong to text areas and are blocked from being mean-shifted away. The literature does not explicitly address the SWT search direction issue; we therefore propose an adaptive sub-block method for determining the correct SWT direction. Both the SWT voting-based color reduction and the SWT direction determination methods are evaluated on binary (text/non-text) images obtained from a challenging Computer Vision Lab optical character recognition database. The SWT voting-based color reduction method outperforms the state-of-the-art text-oriented color reduction approach.
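The voting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform RGB quantization, the `bins_per_channel` and `min_vote_share` parameters, and both function names are assumptions made for illustration. Each pixel flagged by the SWT casts one vote for its quantized color, and colors whose vote share passes a threshold are marked as protected, so that a later mean-shift color reduction step could leave them untouched.

```python
import numpy as np

def swt_color_votes(image, swt_mask, bins_per_channel=8):
    """Count one vote per SWT pixel in a quantized RGB histogram.

    image: HxWx3 uint8 array; swt_mask: HxW bool array (True on SWT pixels).
    The uniform quantization is an illustrative assumption.
    """
    # Map each 8-bit channel value to a bin in [0, bins_per_channel).
    quantized = image.astype(np.int64) * bins_per_channel // 256
    # Flatten the three per-channel bins into one color-bin index per pixel.
    bin_index = (
        quantized[..., 0] * bins_per_channel + quantized[..., 1]
    ) * bins_per_channel + quantized[..., 2]
    # Only pixels selected by the SWT mask are allowed to vote.
    votes = np.bincount(bin_index[swt_mask], minlength=bins_per_channel ** 3)
    return votes, bin_index

def protected_pixels(image, swt_mask, bins_per_channel=8, min_vote_share=0.2):
    """Return an HxW bool mask of pixels whose color received a high SWT vote.

    min_vote_share is a hypothetical threshold, not a value from the paper.
    """
    votes, bin_index = swt_color_votes(image, swt_mask, bins_per_channel)
    total = votes.sum()
    if total == 0:
        # No SWT pixels means no color is protected.
        return np.zeros(swt_mask.shape, dtype=bool)
    protected_bins = (votes / total) >= min_vote_share
    # Look up every pixel's bin, including pixels that did not vote.
    return protected_bins[bin_index]
```

Note that a pixel is protected because of its color, not because it voted: text pixels missed by the SWT still benefit if enough pixels of the same color were detected.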

Highlights

  • Text detection in natural scene images is a very challenging task, far from being completely solved

  • Since the popular text detection datasets, such as International Conference on Document Analysis and Recognition (ICDAR) [15,16] and Street View Text (SVT) [21], are annotated with word rectangles, they are inappropriate for evaluating our color reduction method, which covers the first two stages of the text detection flowchart (Figure 1)

  • We evaluated both stroke width transform (SWT) direction determination and SWT voting-based color reduction methods on CVL OCR BIN DB


Summary

Introduction

Text detection in natural scene images is a very challenging task, far from being completely solved. Uneven illumination and the presence of an almost unlimited number of text fonts, sizes, and orientations pose great difficulties even to state-of-the-art text detection methods. Unlike document images, where text is usually superimposed on either blank or complex backgrounds and is more distinct [1,2,3], natural scene images contain scene text, which is already a part of the captured scene and is often much less distinct. The state-of-the-art literature distinguishes between two major text detection approaches: texture-based and region-based. Texture-based methods [4,5,6,7] scan images at different scales, inspect the area under the sliding window

