Abstract

Reading text in the wild is a challenging task in computer vision. Existing approaches mainly adopt connectionist temporal classification (CTC) or attention models based on recurrent neural networks (RNNs), which are computationally expensive and hard to train. In this paper, instead of the chain structure of an RNN, we propose an end-to-end fully convolutional network with stacked convolutional layers that effectively capture the long-term dependencies among elements of a scene text image. The stacked convolutional layers are far more efficient than a bidirectional long short-term memory (BLSTM) at modeling contextual dependencies. In addition, we design a discriminative feature encoder that incorporates residual attention blocks into a small densely connected network, enhancing the foreground text and suppressing background noise. Extensive experiments on seven standard benchmarks, Street View Text, IIIT5K, ICDAR03, ICDAR13, ICDAR15, COCO-Text, and Total-Text, validate that our method not only achieves state-of-the-art or highly competitive recognition performance but also significantly improves efficiency and reduces the number of parameters.
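To make the two components concrete, the following is a minimal PyTorch sketch of (a) a residual attention block that gates features with a soft mask and (b) a stacked-convolution context module standing in for a BLSTM. The abstract does not specify kernel sizes, depths, or channel widths, so every hyperparameter below, along with the class names ResidualAttentionBlock and ConvContext, is an illustrative assumption rather than the authors' configuration.

import torch
import torch.nn as nn

class ResidualAttentionBlock(nn.Module):
    # A simplified residual attention block in the spirit of Wang et al. (2017):
    # output = (1 + M(x)) * T(x), where the sigmoid mask M enhances foreground
    # text features and suppresses background noise without zeroing the trunk.
    def __init__(self, channels: int):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.mask = nn.Sequential(
            nn.Conv2d(channels, channels, 1),  # illustrative 1x1 mask branch
            nn.Sigmoid(),                      # soft mask in [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = self.trunk(x)
        return (1.0 + self.mask(t)) * t  # residual attention re-weighting

class ConvContext(nn.Module):
    # Stacked 1-D convolutions over the horizontal feature sequence as a
    # drop-in replacement for a BLSTM context module: the receptive field
    # grows with depth, so a few layers cover long-term context in parallel.
    def __init__(self, channels: int, depth: int = 4, kernel: int = 3):
        super().__init__()
        layers = []
        for _ in range(depth):
            layers += [
                nn.Conv1d(channels, channels, kernel, padding=kernel // 2),
                nn.BatchNorm1d(channels),
                nn.ReLU(inplace=True),
            ]
        self.net = nn.Sequential(*layers)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (batch, channels, width), the column-wise feature sequence
        return self.net(seq)

if __name__ == "__main__":
    feats = torch.randn(2, 128, 8, 32)          # (B, C, H, W) CNN feature map
    feats = ResidualAttentionBlock(128)(feats)  # attention-refined features
    seq = feats.mean(dim=2)                     # collapse height -> (B, C, W)
    ctx = ConvContext(128)(seq)                 # context features for decoding
    print(ctx.shape)                            # torch.Size([2, 128, 32])

Because the convolutions have no sequential recurrence, every column of the feature sequence is processed in parallel, which is the source of the efficiency gain over a BLSTM claimed in the abstract.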
