Scene Text Recognition via Dual-path Network with Shape-driven Attention Alignment

Yijie Hu, Qiu-Feng Wang, Wei Wang, Xiaowei Huang, Lei Ding, Bin Dong, Kaizhu Huang

doi:10.1145/3633517

Abstract

Scene text recognition (STR), a typical sequence-to-sequence problem, has recently drawn much attention in multimedia applications. To perform well, an STR model must obtain aligned character-wise features from the whole-image feature maps. Most existing works adopt fully data-driven attention-based alignment, a practice that ignores character-specific geometric information. In this article, we propose a novel shape-driven attention alignment method, built upon a group of learnable geometric points, that obtains character-wise features. Concretely, we first design a corner detector to generate a shape map that explicitly guides the attention alignment, where a series of points can be learned to represent character-wise features flexibly. We then propose a dual-path network with a mutual learning and cooperating strategy that combines a CNN with a ViT-based model, leading to further accuracy improvement. We conduct extensive experiments to evaluate the proposed method on various scene text benchmarks, including six popular regular and irregular datasets, two more challenging datasets (i.e., WordArt and OST), and three Chinese datasets. Experimental results indicate that our method achieves superior performance with a comparable model size against many state-of-the-art models.
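
As a rough illustration of what attention-based alignment means here, the sketch below pools one feature vector per character as an attention-weighted sum over the positions of a flattened feature map. This is a generic formulation, not the article's implementation. The function name `align_character_features` and the optional `shape_map` argument are hypothetical, and adding the log of a per-position shape prior to the attention scores is only one assumed way a shape map could guide the alignment.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def align_character_features(features, scores, shape_map=None):
    """Pool one feature vector per character from a flattened feature map.

    features:  N position vectors, each of length D.
    scores:    T rows of N attention logits, one row per character.
    shape_map: optional N values in (0, 1]. Hypothetical prior, an
               assumption of this sketch, not taken from the article.
    """
    dim = len(features[0])
    aligned = []
    for row in scores:
        if shape_map is not None:
            # Assumed guidance rule: bias each logit by the log of the prior.
            row = [s + math.log(p) for s, p in zip(row, shape_map)]
        weights = softmax(row)
        # Attention-weighted sum of the position vectors.
        aligned.append(
            [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
        )
    return aligned


# Two positions with 2-D features, one character with uniform logits.
demo_features = [[1.0, 0.0], [0.0, 1.0]]
demo_scores = [[0.0, 0.0]]
uniform = align_character_features(demo_features, demo_scores)
guided = align_character_features(demo_features, demo_scores, shape_map=[0.9, 0.1])
```

With uniform logits the pooled vector is the plain average of the two positions. With the illustrative prior, the first position receives most of the weight, which is the intended effect of letting geometric information steer the alignment.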

Journal: ACM Transactions on Multimedia Computing, Communications, and Applications
Publication Date: Jan 11, 2024
Citations: 1
