Abstract
With the rapid development of Internet technology, breakthroughs have been made in all branches of computer vision. In particular, deep learning techniques such as convolutional neural networks have achieved excellent results in image detection and target tracking. To explore the applicability of machine learning in video text recognition and extraction, a YOLOv3 network based on multiscale feature transformation and migration fusion is proposed to improve the accuracy of English text detection in natural scenes in video. First, to address multiscale target detection in video key frames, the scale-transfer module of the STDN algorithm is applied on top of the YOLOv3 network to reduce the low-level feature maps, and a backbone network with feature reuse is constructed to extract features. Then, the scale-transfer module is used to enlarge the high-level feature maps, and a feature pyramid network (FPN) is built to predict targets. Finally, the improved YOLOv3 network is verified on extracting key text from images. The experimental results show that the improved YOLOv3 network effectively reduces the false detections and missed detections caused by occlusion and small targets, and the accuracy of English text extraction is markedly improved.
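The scale-transfer operations described above can be illustrated with a short, hedged sketch. The following PyTorch snippet is our own illustration, not code from the paper; the channel sizes and the scale factor r are assumptions. It shows how a pixel-shuffle style scale-transfer layer enlarges a high-level feature map and how its inverse reduces a low-level one:

import torch
import torch.nn as nn

r = 2  # scale factor (assumed)

# Enlarging a high-level feature map: pixel shuffle rearranges
# a (C*r^2, H, W) tensor into (C, H*r, W*r) without adding parameters.
upscale = nn.PixelShuffle(upscale_factor=r)

# Reducing a low-level feature map: the inverse rearrangement,
# (C, H*r, W*r) -> (C*r^2, H, W).
downscale = nn.PixelUnshuffle(downscale_factor=r)

high = torch.randn(1, 256, 13, 13)  # deep, semantically strong map (assumed shape)
low = torch.randn(1, 64, 52, 52)    # shallow, spatially detailed map (assumed shape)

print(upscale(high).shape)    # torch.Size([1, 64, 26, 26])
print(downscale(low).shape)   # torch.Size([1, 256, 26, 26])

Because both operations merely rearrange tensor elements, they add no learnable parameters, which is the usual motivation for scale-transfer layers over deconvolution.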
Highlights
As the mainstream medium of today's media industry, images and videos are rich in information and easy to understand, which makes them an indispensable part of daily life
Conventional object detection methods from the visual field (SSD, You Only Look Once (YOLO), Faster R-CNN, etc.) are not ideal when applied directly to English text detection tasks. The main reason is that, compared with conventional objects, the length of text lines and their length-to-width ratios vary widely. Therefore, after analyzing the YOLO series of networks, we propose a new multiscale feature fusion method, which improves the performance of the YOLOv3 network
To construct features with multiscale characteristics and rich expressive ability, we introduce a feature scale transformation and migration fusion method to improve the traditional YOLOv3 network, as sketched below
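As a rough illustration of the fusion step (a minimal sketch under assumed layer widths; the FusionNeck class and all channel counts are hypothetical, not taken from the paper), the rescaled low-level and high-level maps can be concatenated with a mid-level map and fused by a 1x1 convolution before a YOLO-style prediction head:

import torch
import torch.nn as nn

class FusionNeck(nn.Module):
    def __init__(self):
        super().__init__()
        self.down = nn.PixelUnshuffle(2)  # low-level: (64, 52, 52) -> (256, 26, 26)
        self.up = nn.PixelShuffle(2)      # high-level: (256, 13, 13) -> (64, 26, 26)
        # a 1x1 convolution fuses the concatenated maps into one prediction feature
        self.fuse = nn.Conv2d(256 + 128 + 64, 256, kernel_size=1)

    def forward(self, low, mid, high):
        # bring all three maps to a common 26x26 resolution, then concatenate
        x = torch.cat([self.down(low), mid, self.up(high)], dim=1)
        return self.fuse(x)

neck = FusionNeck()
fused = neck(torch.randn(1, 64, 52, 52),
             torch.randn(1, 128, 26, 26),
             torch.randn(1, 256, 13, 13))
print(fused.shape)  # torch.Size([1, 256, 26, 26])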
Summary
As the mainstream part of today's media industry, images and videos are rich in information and easy to understand, which makes them an indispensable part of life. Character recognition has great application value in many scenarios, such as vehicle license plate detection, image-text conversion, image content translation, and image search. Because the precision of current text recognition technology is not ideal, its application scenarios remain relatively simple, such as content search in images [1,2,3,4,5,6]. Characters are easy to extract and have strong descriptive ability, so understanding the semantic information of characters in images is an urgent problem to be solved. Text in images falls into two kinds: one kind is text in the natural scene of the image [13,14,15], such as license plate numbers and bus stop sign text [16], and the other kind is artificially added text, such as movie subtitles, advertising information, and medical image analysis text [17]. Therefore, all words should be extracted except those repeated within a short time delay. The final result of the recognition system can only be determined if the images and characters have good detection performance, so text detection is the key research content of this paper