Abstract

In this paper, we present a robust approach to text detection in natural images based on a region proposal mechanism. A powerful low-level detector named saliency-enhanced MSER, extended from the widely used MSER, is proposed by incorporating saliency detection methods, which ensures a high recall rate. Given a natural image, character candidates are extracted from three channels of a perception-based, illumination-invariant color space by the saliency-enhanced MSER algorithm. A discriminative convolutional neural network (CNN) is jointly trained with multi-level information, including pixel-level and character-level information, to serve as the character-candidate classifier. Leveraging the confidence scores obtained from the CNN, each image patch is classified as strong text, weak text, or non-text by double-threshold filtering instead of conventional one-step classification. To further prune non-text regions, we develop a recursive neighborhood search algorithm that tracks credible texts from the weak-text set. Finally, characters are grouped into text lines using heuristic features such as spatial location, size, color, and stroke width. We compare our approach with several state-of-the-art methods, and experiments show that it achieves competitive performance on the public datasets ICDAR 2011 and ICDAR 2013.
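The double-threshold filtering and recursive neighborhood search described above can be sketched as a hysteresis-style procedure: strong-text candidates are kept outright, and weak-text candidates are promoted only if they lie near already-kept text. The thresholds `T_HIGH`/`T_LOW`, the `(x, y, score)` candidate representation, and the `near` predicate below are illustrative assumptions, not the paper's actual values or criteria.

```python
from collections import deque

T_HIGH, T_LOW = 0.8, 0.3  # assumed CNN-confidence thresholds (hypothetical)

def near(a, b, max_dist=50):
    # Hypothetical spatial-neighborhood test on (x, y, score) candidates;
    # the paper's neighborhood criteria likely also use size/color cues.
    return abs(a[0] - b[0]) <= max_dist and abs(a[1] - b[1]) <= max_dist

def filter_candidates(cands):
    """cands: list of (x, y, score) patches. Returns those kept as text."""
    strong = [c for c in cands if c[2] >= T_HIGH]
    weak = [c for c in cands if T_LOW <= c[2] < T_HIGH]
    kept, frontier = set(strong), deque(strong)
    # Recursively promote weak candidates adjacent to already-kept text,
    # so chains of weak patches anchored by a strong patch survive.
    while frontier:
        cur = frontier.popleft()
        for w in weak:
            if w not in kept and near(cur, w):
                kept.add(w)
                frontier.append(w)
    return sorted(kept, key=lambda c: -c[2])
```

An isolated weak candidate is discarded, while a chain of weak candidates touching a strong one is retained, which is the intended advantage over one-step classification with a single threshold.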

Highlights

  • Reading text in the wild is significant in a variety of advanced computer vision applications, such as image and video retrieval, scene understanding and visual assistance, since text in images usually conveys valuable information

  • We propose a robust approach which combines the advantages of both Maximally Stable Extremal Region (MSER) and convolutional neural network (CNN) feature representations

  • We evaluated the proposed method on two widely cited datasets for benchmarking scene text detection: ICDAR 2011 RRC dataset [53], and ICDAR 2013 RRC dataset [17]



Introduction

Reading text in the wild is significant in a variety of advanced computer vision applications, such as image and video retrieval, scene understanding, and visual assistance, since text in images usually conveys valuable information. Detecting and recognizing text in scene images has received increasing attention in the community. Though extensively studied in recent years, text detection in unconstrained environments remains quite challenging due to a number of factors, such as high variation in character font, size, color, and orientation, as well as complicated backgrounds and non-uniform illumination. Previous works on scene text detection based on sliding windows [1,2,3,4,5] and connected component analysis [6,7,8,9,10,11,12,13,14] have become mainstream in this domain. Sliding-window-based methods localize text regions by shifting a multi-scale classification window across the image.
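The multi-scale sliding-window scheme mentioned above can be sketched as enumerating candidate boxes that a text/non-text classifier would then score. The window size, stride, and scale set below are assumptions for illustration, not values from any cited method.

```python
def sliding_windows(img_w, img_h, win=32, stride=16, scales=(1.0, 0.5)):
    """Yield (x, y, w, h) candidate boxes over an img_w x img_h image.

    Smaller scales enlarge the window, which is equivalent to sliding a
    fixed-size window over a downscaled image.
    """
    for s in scales:
        w = h = int(win / s)
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)
```

Each yielded box would be cropped and fed to a classifier; the dense enumeration across positions and scales is what makes sliding-window methods exhaustive but computationally expensive compared with connected-component approaches.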


