Abstract

Recognizing irregular text in natural images is a challenging task in computer vision. Existing approaches still struggle with irregular text because of its diverse shapes. In this paper, we propose a simple yet powerful irregular text recognition framework based on an encoder-decoder architecture. The proposed framework is divided into four main modules. Firstly, in the image transformation module, a Thin Plate Spline (TPS) transformation is employed to rectify the irregular text image into a readable text image. Secondly, we propose a novel Spatial Attention Module (SAM) that compels the model to concentrate on text regions and produces enriched feature maps. Thirdly, a deep bi-directional long short-term memory (Bi-LSTM) network converts the visual feature map generated by a Convolutional Neural Network (CNN) into a contextual feature map. Finally, we propose a Dual Step Attention Mechanism (DSAM), integrated with the Connectionist Temporal Classification (CTC)-Attention decoder, that re-weights visual features and focuses on intra-sequence relationships to generate a more accurate character sequence. The effectiveness of our proposed framework is verified through extensive experiments on benchmark datasets such as SVT, ICDAR, CUTE80, and IIIT5k. The performance of the proposed framework is analyzed with the accuracy metric, and the results demonstrate that our method outperforms existing approaches on both regular and irregular text. Additionally, the robustness of our approach is evaluated on grocery datasets such as GroZi-120, WebMarket, SKU-110K, and Freiburg Groceries, which contain complex text images; even on these datasets, our framework achieves superior performance.
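
The abstract outlines a four-stage pipeline: rectification, attention-augmented CNN features, Bi-LSTM context modeling, and sequence decoding. The sketch below is a minimal, simplified rendering of that pipeline in PyTorch, not the authors' implementation: the TPS rectifier is replaced by a plain affine spatial transformer, SAM is reduced to a single sigmoid-gated mask, and only the CTC branch of the CTC-Attention decoder is shown (the DSAM and the attention decoder are omitted). All module names, layer sizes, and hyperparameters are illustrative assumptions.

```python
# Simplified sketch of the four-module recognizer described in the abstract.
# Each module here is a stand-in, not the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Rectifier(nn.Module):
    """Stand-in for the TPS transformation module (affine STN for brevity)."""
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 6))
        # Start from the identity transform.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)


class SpatialAttention(nn.Module):
    """Stand-in for SAM: re-weights feature maps with a spatial mask."""
    def __init__(self, channels):
        super().__init__()
        self.mask = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):
        return feats * torch.sigmoid(self.mask(feats))


class Recognizer(nn.Module):
    def __init__(self, num_classes=37, channels=64, hidden=128):
        super().__init__()
        self.rectifier = Rectifier()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)))           # collapse height
        self.sam = SpatialAttention(channels)
        self.bilstm = nn.LSTM(channels, hidden, num_layers=2,
                              bidirectional=True, batch_first=True)
        self.ctc_head = nn.Linear(2 * hidden, num_classes)  # CTC branch only

    def forward(self, images):
        x = self.rectifier(images)                     # 1) rectification
        f = self.sam(self.cnn(x))                      # 2) CNN + spatial attention
        seq = f.squeeze(2).permute(0, 2, 1)            # (B, W, C) visual sequence
        ctx, _ = self.bilstm(seq)                      # 3) contextual features
        return self.ctc_head(ctx).log_softmax(-1)      # 4) per-frame CTC logits


if __name__ == "__main__":
    model = Recognizer()
    dummy = torch.randn(2, 1, 32, 100)                 # grayscale 32x100 crops
    print(model(dummy).shape)                          # (2, W', num_classes)
```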
