Abstract

Hundreds of text detection methods have been proposed, motivated by their widespread use in many applications. Despite considerable progress in the area, including the use of sophisticated learning schemes, ad-hoc post-processing procedures are still often employed to improve detection results by reducing both false positives and false negatives. Another issue is that the complementary views provided by different text detection methods are rarely exploited. This paper aims to fill these gaps. We propose a soft computing framework, based on genetic programming (GP), that guides the definition of suitable post-processing procedures through the combination of basic operators, which may be applied to improve the detection results provided by multiple methods at the same time. Experiments on the widely used ICDAR 2011, ICDAR 2013, and ICDAR 2015 datasets demonstrate that our GP-based approach yields F1 gains of up to 5.1 percentage points when compared to several baselines.
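To make the idea of combining basic operators over the outputs of several detectors more concrete, the following is a minimal sketch of how a candidate post-processing procedure could be represented as an expression tree and scored by F1. Everything in it is an illustrative assumption rather than the paper's actual design: the set-based pixel masks, the three operators (`union`, `intersection`, `difference`), the detector names, the pixel-level F1, and the helper names `evaluate`, `f1_score`, and `best_individual` are all hypothetical. It also shows only the fitness evaluation step, not the evolutionary search (crossover, mutation, selection) that a GP framework would run on top of it.

```python
# Hypothetical sketch: each detection mask is a set of (row, col) pixels
# flagged as text. The operator set and the pixel-level F1 are illustrative
# assumptions, not the operators or evaluation protocol used in the paper.

OPERATORS = {
    "union": lambda a, b: a | b,
    "intersection": lambda a, b: a & b,
    "difference": lambda a, b: a - b,
}


def evaluate(tree, detections):
    """Evaluate an expression tree over detector outputs.

    A tree is either a detector name (leaf) or a tuple
    (operator_name, left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return detections[tree]
    op, left, right = tree
    return OPERATORS[op](evaluate(left, detections), evaluate(right, detections))


def f1_score(predicted, truth):
    """Pixel-level F1 between a predicted mask and the ground-truth mask."""
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


def best_individual(population, detections, truth):
    """Return the candidate tree with the highest F1 (the fitness function)."""
    return max(population, key=lambda t: f1_score(evaluate(t, detections), truth))


# Toy data: three hypothetical detectors with complementary errors.
truth = {(0, 0), (0, 1), (0, 2), (0, 3)}
detections = {
    "detector_a": {(0, 0), (0, 1), (5, 5)},
    "detector_b": {(0, 1), (0, 2), (0, 3), (5, 5), (6, 6)},
    "detector_c": {(5, 5), (6, 6)},
}

# A tiny hand-written population; a GP run would evolve such trees instead.
population = [
    "detector_a",
    "detector_b",
    ("union", "detector_a", "detector_b"),
    ("difference", ("union", "detector_a", "detector_b"), "detector_c"),
]

best = best_individual(population, detections, truth)
print(best, f1_score(evaluate(best, detections), truth))
```

In this toy setting, the union of two detectors recovers pixels that each one misses, and subtracting a third detector's output removes the shared false positives, so the combined tree scores higher than either detector alone. This is only meant to illustrate why combining complementary detectors can help; it says nothing about the gains reported in the paper.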

Highlights

  • Texts are essential elements for effective communication in our daily life

  • We select from the literature effective text localization methods based on classical machine learning techniques, such as Scene Text Recognition [13], SnooperText [10], and MSER-SWT Text Detection [26], [27], hereinafter referred to as non-deep learning methods

  • For the experiments related to the nonrestrictive computing scenario, we select effective and efficient methods based on Convolutional Neural Networks (CNNs): TextBoxes++ [7], Pelee-Text [9], and PixelLink [8]

Introduction

Texts are essential elements for effective communication in our daily life. Texts and words are everywhere, being used to guide us in specific activities or even to label objects. In both scenarios, textual elements can play an important role in the semantic understanding of scenes. In several computer vision tasks, the understanding of textual elements in a scene may be paramount for machines to be able to recognize important events in multimedia data. Several researchers are striving towards devising applications that aim at understanding textual elements present in scenes [1]–[3]. Different from the classic optical character recognition problem, the task of localizing and recognizing text in real
