Abstract

In current era of technology, information acquisition from images and videos become most important task due to the rapid development of data mining and machine learning.The information can be either textual, visual, or combination of these. Text appearing in images or videos is a significant source of information and plays a vital role to perceive it. Developing a unified method to detect the text is hard, as textual properties (i.e. font, size, color, illumination, orientation, etc.) may vary with the complex background. So far, multimedia and computer vision community unable yet to standardize any ideal approach to extract the text smoothly. In this paper, a novel method is proposed to detect and localize artificial Urdu text in individual video frames. Firstly, Sobel and Canny edge detection operators are applied to input frame and are merged with MSER (Maximally Stable Extremal Region) detected regions. Next, geometric constraints are applied to eliminate obvious non-text regions with large and small variations. Further refining of non-text regions is achieved by stroke width transform. SVM (Support Vector Machine) classifier is trained to classify text and non-text objects. Finally, bounding boxes are used to localize the text.Experimental results show that the proposed method is robust and efficient than state-of-the-art methods.

Highlights

  • INTRODUCTIONReading the text and localizing it from videos and images became more popular and more challenging task since the last decade [1,2,3]

  • We evaluated the proposed approach on publicly accessible Artificial Urdu Text Dataset [25] and compared the results with state-of-the-art methods available for Urdu text.The dataset consists of 1000 individual video images which are captured from different Urdu TV channels (e.g. News, Sports, Business, Entertainment, and Religion)

  • We have investigated a robust approach to detect and localize artificial Urdu text in individual video frames

Read more

Summary

INTRODUCTION

Reading the text and localizing it from videos and images became more popular and more challenging task since the last decade [1,2,3]. Text extraction in video has recently gained much when it exists in the multilingual and complex consideration in multimedia understanding systems. Many researchers explored several methods for text detection and localization from videos in order to develop robust video retrieval systems [7,8,9,10] Most of these methods focused on English language and some methods for Chinese and other languages.

RELATED WORK
PROPOSED METHOD
Text Detection
Localization and Validation
Segmentation and Extraction
Character Classification
EXPERIMENTAL RESULTS
Dataset
Experimental Setup
CONCLUSION
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call