Abstract

Multioriented text detection and recognition in natural scene images remain challenging problems in the document analysis and computer vision communities. In particular, character segmentation plays an important role in the performance of a complete end-to-end recognition system. In this work, a robust multioriented text detection and segmentation method based on a biological visual system model is proposed. The proposed method exploits the local energy model instead of the common approach based on variations of local image pixel intensities. Features such as lines and edges are obtained by searching for the maximum local energy using the scale-space monogenic signal framework. The candidate text components are extracted from maximally stable extremal regions of the local phase information of the image. The candidate regions are filtered by their phase congruency and classified as text or nontext components by an AdaBoost classifier. Finally, misclassified characters are restored, and all final characters are grouped into words. Experimental results show that the proposed text detection and segmentation method is invariant to scale and rotation changes and robust to perspective distortions, blurring, low resolution, and illumination variations (low contrast, high brightness, shadows, and nonuniform illumination). In addition, the proposed method often achieves better performance than state-of-the-art methods on typical natural scene datasets.

Highlights

  • Nowadays, imagery has become an indispensable source of human communication and interaction

  • Two evaluation types are selected for text segmentation and text localization

  • For character candidate generation evaluation, the recall-similarity rate is utilized. The recall-similarity is defined as the ratio between the total correctly detected candidate regions and the ground truth characters
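
The recall-similarity ratio described in the last highlight can be illustrated with a minimal sketch. The excerpt does not say how a candidate region is judged "correctly detected", so the sketch assumes axis-aligned bounding boxes and an intersection-over-union (IoU) test with a threshold of 0.5. It also assumes that the numerator counts ground truth characters matched by at least one candidate. The names `iou`, `recall_similarity`, and `threshold` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max); this representation is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_similarity(candidates: List[Box],
                      ground_truth: List[Box],
                      threshold: float = 0.5) -> float:
    """Fraction of ground truth characters matched by at least one candidate.

    The IoU matching rule and the default threshold are assumptions made
    for illustration; the paper may use a different similarity criterion.
    """
    if not ground_truth:
        return 0.0
    matched = sum(
        1 for gt in ground_truth
        if any(iou(gt, cand) >= threshold for cand in candidates)
    )
    return matched / len(ground_truth)


# Two ground truth characters; only the first is covered by a candidate.
gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
cand_boxes = [(0, 0, 10, 10), (100, 100, 110, 110)]
score = recall_similarity(cand_boxes, gt_boxes)
```

In this toy case one of the two ground truth characters is matched, so the ratio is one half. A stricter or looser `threshold` changes which candidates count as correct detections.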

Summary

Introduction

Imagery has become an indispensable source of human communication and interaction. Digital images with textual content provide useful information for tasks related to document classification, multimedia retrieval, language translation, text-to-speech conversion, robotic navigation, and augmented reality, to name a few [1, 2]. The analysis of this textual information basically involves three stages: text detection, character segmentation, and word recognition. The fundamental goal of text detection is to determine whether there is text in a given image, while character segmentation considers the extraction and localization of characters from background pixels. Although the character segmentation and word recognition stages are not necessarily applied in a specific order, performing character segmentation first could provide better performance for the subsequent processes. Text localization and character segmentation are still challenges in the document analysis and computer vision communities (http://rrc.cvc.uab.es/?com=introduction). Natural scenes are commonly captured under uncontrolled conditions (illumination changes, partial occlusion, low resolution, sensor noise, blur, and alignment) and could contain complex backgrounds (people, buildings, fences, bricks, grass, trees, and cars) [1, 2, 3].
