Malayalam text and non-text classification of natural scene images based on multiple instance learning

Anit V Manjaly, B Shanmuga Priya

doi:10.1109/icaca.2016.7887949

Abstract

Information is one of the most important assets in the present world. Within it, text information plays an essential role and can take diverse forms. Natural images that contain such text information are called scene text images. The semantic information in an image is used for content-based image retrieval, indexing, and classification. The first stage of text extraction is text and non-text classification, which determines whether text is present in an image. Compared to English, determining the presence of Malayalam text in a scene image is more difficult because of the agglutinative nature of the language. In this paper, the proposed work classifies natural scene images into Malayalam text and non-text images using Multiple Instance Learning (MIL). Our own dataset, which contains natural scene images with Malayalam text and non-text images, is used for the performance evaluation. The analysis is reported in terms of precision, recall, F1 score, and accuracy, and the experimental results are promising.
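
The classification and evaluation steps named in the abstract can be illustrated with a minimal sketch. It assumes the standard MIL formulation, in which an image is a bag of patch-level instances and the bag is labelled as text if at least one instance is scored as text; the abstract does not state which MIL variant the authors use, so this rule, the function names `bag_label` and `evaluate`, and the `threshold` parameter are all assumptions for illustration. The four metrics follow their usual definitions from true/false positive and negative counts.

```python
from typing import Dict, List


def bag_label(instance_scores: List[float], threshold: float = 0.5) -> int:
    """Label a bag (image) under the standard MIL assumption.

    Assumption: the bag is positive (contains text) if any instance
    (patch) score reaches the threshold. This is the textbook MIL rule,
    not necessarily the rule used in the paper.
    """
    return int(any(score >= threshold for score in instance_scores))


def evaluate(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute precision, recall, F1 score, and accuracy for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Hypothetical patch scores for four images; the values are made up.
    bags = [[0.1, 0.9, 0.2], [0.2, 0.3], [0.6, 0.1], [0.05, 0.1, 0.2]]
    truth = [1, 1, 0, 0]
    predictions = [bag_label(b) for b in bags]
    print(predictions)
    print(evaluate(truth, predictions))
```

The any-instance rule keeps the sketch simple; a trained MIL classifier would normally learn the instance scores rather than take them as given.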
