Abstract

Multi-script identification helps in automatically selecting an appropriate OCR engine when a video contains several scripts; however, script identification in video frames is challenging because the low resolution and complex backgrounds of video often cause disconnections or loss of text information. This paper presents a novel idea that integrates Gradient-Spatial-Features (GSpF) and Gradient-Structural-Features (GStF) at block level, based on an error factor and the weights of the features, to identify six video scripts, namely Arabic, Chinese, English, Japanese, Korean and Tamil. Horizontal and vertical gradient values are first computed for each text block to increase the contrast of text pixels. The method then divides the horizontal and the vertical gradient blocks into two equal parts at the centroid in the horizontal direction. A histogram operation is performed on each part to select dominant text pixels from the respective subparts of the horizontal and the vertical gradient blocks, which yields text components. After extracting GSpF and GStF from the text components, we finally propose to integrate the spatial and the structural features, based on end points, intersection points, junction points and straightness of the skeleton of the text components, in a novel way to identify the scripts. The method is evaluated on 970 video frames of six scripts, which involve font, font size and contrast variations, and is compared with an existing method in terms of classification rate. Experimental results show that the proposed method achieves an average classification rate of 83.0% for video script identification. The method is also evaluated on noisy images and scanned low-resolution documents, illustrating the robustness and the extensibility of the proposed Gradient-Spatial-Structural Features.
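
The preprocessing steps described above (gradient computation, a split at the centroid, and histogram-based selection of dominant pixels) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the gradient operator, the exact split, or the histogram rule, so the central-difference gradient, the horizontal cut through the gradient-weighted centroid row, and the "keep pixels above the most populated histogram bin" rule are all assumptions, as are the function names `gradient_blocks`, `centroid_row`, `dominant_pixels` and `text_pixel_masks`.

```python
import numpy as np


def gradient_blocks(block):
    """Return absolute horizontal and vertical gradient maps of a grayscale block.

    Assumption: central differences via np.gradient stand in for whatever
    gradient operator the paper actually uses.
    """
    block = np.asarray(block, dtype=np.float64)
    grad_rows, grad_cols = np.gradient(block)  # derivatives along axis 0 and axis 1
    return np.abs(grad_cols), np.abs(grad_rows)


def centroid_row(grad):
    """Row index of the gradient-weighted centroid, clamped so both parts are non-empty."""
    height = grad.shape[0]
    total = grad.sum()
    if total == 0:
        row = height // 2
    else:
        row = int(round((grad.sum(axis=1) * np.arange(height)).sum() / total))
    return min(max(row, 1), height - 1)


def dominant_pixels(part, bins=16):
    """Hypothetical histogram rule: keep pixels above the most populated bin.

    The most populated bin is assumed to hold background gradient values.
    """
    hist, edges = np.histogram(part, bins=bins)
    threshold = edges[np.argmax(hist) + 1]  # upper edge of the peak bin
    return part > threshold


def text_pixel_masks(block):
    """Return boolean masks of candidate text pixels for both gradient maps."""
    masks = []
    for grad in gradient_blocks(block):
        split = centroid_row(grad)  # assumed: horizontal cut through the centroid
        upper = dominant_pixels(grad[:split])
        lower = dominant_pixels(grad[split:])
        masks.append(np.vstack([upper, lower]))
    return masks[0], masks[1]
```

Calling `text_pixel_masks` on a 2-D grayscale array with at least two rows returns one mask per gradient direction. Connected components of these masks would correspond to the text components from which the spatial and structural features are extracted; skeletonization and the feature integration step are not sketched here.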
