Abstract

Scene text recognition has been a hot research topic in computer vision due to its wide range of applications. State-of-the-art solutions usually rely on an attention-based encoder-decoder framework that learns the mapping between input images and output sequences in a purely data-driven way. Unfortunately, severe misalignment between feature areas and text labels often arises in real-world scenarios. To address this problem, this paper proposes a sequential alignment attention model that enhances the alignment between input images and output character sequences. In this model, an attention gated recurrent unit (AGRU) is first devised to distinguish text from background regions and to extract localized features focused on sequential text regions. Furthermore, a CTC-guided decoding strategy is integrated into the popular attention-based decoder, which not only accelerates training convergence but also enhances well-aligned sequence recognition. Extensive experiments on various benchmarks, including the IIIT5k, SVT, and ICDAR datasets, show that our method substantially outperforms the state-of-the-art methods.
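The abstract does not spell out how the CTC guidance is combined with the attention decoder. A common way to realize such guidance is a weighted joint objective over a CTC head on the encoder and the cross-entropy loss of the attention decoder. The sketch below is a minimal PyTorch illustration under that assumption only; the class name, the weighting factor `lambda_ctc`, and the tensor shapes are hypothetical and are not taken from the paper.

```python
import torch.nn as nn

class HybridCTCAttentionLoss(nn.Module):
    """Hypothetical joint CTC/attention objective (illustrative sketch,
    not the paper's exact formulation)."""

    def __init__(self, blank: int = 0, lambda_ctc: float = 0.2):
        super().__init__()
        self.ctc = nn.CTCLoss(blank=blank, zero_infinity=True)
        self.ce = nn.CrossEntropyLoss(ignore_index=-100)
        self.lambda_ctc = lambda_ctc  # assumed trade-off weight

    def forward(self, ctc_log_probs, attn_logits, targets,
                input_lengths, target_lengths):
        # ctc_log_probs: (T, B, C) log-probabilities from a CTC head on the encoder
        # attn_logits:   (B, L, C) logits from the attention decoder
        # targets:       (B, L) character indices, padded with -100 for cross-entropy
        ctc_targets = targets.clone()
        ctc_targets[ctc_targets == -100] = 0  # replace padding with blank for CTC
        loss_ctc = self.ctc(ctc_log_probs, ctc_targets,
                            input_lengths, target_lengths)
        loss_attn = self.ce(attn_logits.transpose(1, 2), targets)
        return self.lambda_ctc * loss_ctc + (1.0 - self.lambda_ctc) * loss_attn
```

In this form the CTC branch shares the encoder and pushes it toward monotonic, well-aligned label sequences, while the attention decoder still produces the final recognition output; this matches the abstract's claim of faster convergence and better-aligned sequences, though the paper's actual decoding strategy may differ.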
