Abstract

Text detection in videos is challenging due to low resolution and complex background of videos. Besides, an arbitrary orientation of scene text lines in video makes the problem more complex and challenging. This paper presents a new method that extracts text lines of any orientations based on gradient vector flow (GVF) and neighbor component grouping. The GVF of edge pixels in the Sobel edge map of the input frame is explored to identify the dominant edge pixels which represent text components. The method extracts edge components corresponding to dominant pixels in the Sobel edge map, which we call text candidates (TC) of the text lines. We propose two grouping schemes. The first finds nearest neighbors based on geometrical properties of TC to group broken segments and neighboring characters which results in word patches. The end and junction points of skeleton of the word patches are considered to eliminate false positives, which output the candidate text components (CTC). The second is based on the direction and the size of the CTC to extract neighboring CTC and to restore missing CTC, which enables arbitrarily oriented text line detection in video frame. Experimental results on different datasets, including arbitrarily oriented text data, nonhorizontal and horizontal text data, Hua's data and ICDAR-03 data (camera images), show that the proposed method outperforms existing methods in terms of recall, precision and f-measure.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.