Scene Text Research Articles

The advent of the study of Scene Text Detection and Recognition has exposed some significant challenges text recognition faces, such as blurred text detection. This study proposes a comparative model for detecting blurred text in wild scenes using independent component analysis (ICA) and enhanced genetic algorithm (E-GA) with support vector machine (SVM) and k-nearest neighbors (KNN) as classifiers. The proposed model aims to improve the accuracy of blurred text detection in challenging environments with complex backgrounds, noise, and illumination variations. The proposed model consists of three main stages: preprocessing, feature extraction, and classification. In the preprocessing stage, the input image is first preprocessed to remove noise and enhance edges using a median filter and a Sobel filter, respectively. Then, the blurred text regions are extracted using the Laplacian of Gaussian (LoG) filter. In the feature extraction stage, ICA is used to extract independent components from the blurred text regions. The extracted components are then fed into an E-GA-based feature selection algorithm to select the most discriminative features. The E-GA simply fine tunes the selection functionalities of the traditional GA using a bird approach. The selected features are then normalized and fed into the SVM and KNN classifiers. Experimental results on a benchmarking dataset (ICDAR 2019 LSVT) shows that the model outperforms state-of-the-art methods in terms of detection accuracy, precision, recall, and F1-score. The proposed model achieves an overall accuracy of 95.13% for SVM and 88.69% for KNN, which is significantly higher than the already existing methods which for SVM is 93%. In conclusion, the proposed model provides a promising approach for detecting blurred text in wild scenes. The combination of ICA, E-GA, and SVM/KNN classifiers enhances the robustness and accuracy of the detection system, which can be beneficial for a wide range of applications, such as text recognition, document analysis, and security systems.

End-to-end multilingual scene text spotting aims to integrate scene text detection and recognition into a unified framework. Actually, the accuracy of text recognition largely depends on the accuracy of text detection. Due to the lackage of benchmarks with adequate and high-quality character-level annotations for multilingual scene text spotting, most of the existing methods train on the benchmarks only with word-level annotations. However, the performance of multilingual scene text spotting are not that satisfied training on the existing benchmarks, especially for those images with special layout or words out of vocabulary. In this paper, we proposed a simple YOLO-like baseline named CMSTR for character-level multilingual scene text spotting simultaneously and efficiently. Technically, for each text instance, we represent the character sequence as ordered points and model them with learnable explicit point queries. After passing a single decoder, the point queries have encoded requisite text semantics and locations, thus can be further decoded to the center line, boundary, script, and confidence of text via very simple prediction heads in parallel. Furthermore, we show the surprisingly good extensibility of our method, in terms of character class, language type, and task. On the one hand, DeepSolo not only performs well in English scenes but also masters the Chinese transcription with complex font structure and a thousand-level character classes. On the other hand, based on the extensibility of DeepSolo, we launch DeepSolo++ for multilingual text spotting, making a further step to let Transformer decoder with explicit points solo for multilingual text detection, recognition, and script identification all at once.

Scene Text Research Articles

Related Topics

Articles published on Scene Text

DPNet: Scene text detection based on dual perspective CNN-transformer.

A New Contrastive Learning based Vision Transformer for Sentiment Analysis using Scene Text Images

A Comparative Model for Blurred Text Detection in Wild Scene Using Independent Component Analysis (ICA) and Enhanced Genetic Algorithm (Using a Bird Approach) with Classifiers

ACP-Net: Asymmetric Center Positioning Network for Real-Time Text Detection

Soft set-based MSER end-to-end system for occluded scene text detection, recognition and prediction

GRNet: a graph reasoning network for enhanced multi-modal learning in scene text recognition

HPNet: Text Detection Network with Hybrid Attention and Pixel Aggregation for Irregularly-Shaped Nearby Texts

Visual place recognition from end-to-end semantic scene text features.

Improving Scene Text Retrieval via Stylized Middle Modality

A novel domain independent scene text localizer

Scene Chinese Recognition with Local and Global Attention

An End-to-End Scene Text Recognition for Bilingual Text

ESRNet: an exploring sample relationships network for arbitrary-shaped scene text detection

Rethinking Multilingual Scene Text Spotting: A Novel Benchmark and a Character-Level Feature Based Approach

Turning a CLIP Model Into a Scene Text Spotter.

Batch-transformer for scene text image super-resolution

Self-supervised memory learning for scene text image super-resolution

Rectification of Curved Scene Text Based on B-Spline Curve Fitting

DPGS: Cross-cooperation guided dynamic points generation for scene text spotting

Transformer-based end-to-end attack on text CAPTCHAs with triplet deep attention

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Scene Text Research Articles

Related Topics

Articles published on Scene Text

DPNet: Scene text detection based on dual perspective CNN-transformer.

A New Contrastive Learning based Vision Transformer for Sentiment Analysis using Scene Text Images

A Comparative Model for Blurred Text Detection in Wild Scene Using Independent Component Analysis (ICA) and Enhanced Genetic Algorithm (Using a Bird Approach) with Classifiers

ACP-Net: Asymmetric Center Positioning Network for Real-Time Text Detection

Soft set-based MSER end-to-end system for occluded scene text detection, recognition and prediction

GRNet: a graph reasoning network for enhanced multi-modal learning in scene text recognition

HPNet: Text Detection Network with Hybrid Attention and Pixel Aggregation for Irregularly-Shaped Nearby Texts

Visual place recognition from end-to-end semantic scene text features.

Improving Scene Text Retrieval via Stylized Middle Modality

A novel domain independent scene text localizer

Scene Chinese Recognition with Local and Global Attention

An End-to-End Scene Text Recognition for Bilingual Text

ESRNet: an exploring sample relationships network for arbitrary-shaped scene text detection

Rethinking Multilingual Scene Text Spotting: A Novel Benchmark and a Character-Level Feature Based Approach

Turning a CLIP Model Into a Scene Text Spotter.

Batch-transformer for scene text image super-resolution

Self-supervised memory learning for scene text image super-resolution

Rectification of Curved Scene Text Based on B-Spline Curve Fitting

DPGS: Cross-cooperation guided dynamic points generation for scene text spotting

Transformer-based end-to-end attack on text CAPTCHAs with triplet deep attention