Abstract

Computer vision has become the poster child for deep learning. The image classification accuracy of convolutional neural networks (CNNs) on benchmark data sets has increased every year since their inception, aided by advances in feature fusion. The increasing availability of image-text co-occurrence has led to text-augmented feature spaces that yield higher accuracy in image classification tasks. However, these works are limited to instances where text is readily available. This study presents an approach to featurize text within natural images with the goal of augmenting image features for image classification tasks. Text extraction and featurization in natural images is difficult because text localization and OCR are both unreliable, impeded by the variability of image text and by OCR errors. We overcome these challenges by implementing a novel bounding box concatenation algorithm and a novel feature boosting algorithm. The result is a pipeline that encodes an image into a text feature space. Classifiers trained on the text-based feature space have accuracy comparable to state-of-the-art CNNs while being significantly less expensive computationally. Moreover, augmenting image features with text features generates a hybrid feature space with higher information content for a classification problem than a feature space composed exclusively of image features. Thus, we see a rise in classification accuracy across all state-of-the-art machine learning algorithms.
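
As a rough illustration of the hybrid feature space described above, the following sketch concatenates an image feature vector with a bag-of-words vector built from OCR tokens. It is only an assumption about how such augmentation could look, not the pipeline of this study: the function names (`text_features`, `l2_normalize`, `hybrid_features`), the fixed vocabulary, and the L2 normalization of each part are hypothetical choices, and the bounding box concatenation and feature boosting algorithms are not reproduced here.

```python
from collections import Counter


def text_features(ocr_tokens, vocabulary):
    """Count how often each vocabulary word appears among the OCR tokens."""
    counts = Counter(token.lower() for token in ocr_tokens)
    return [float(counts[word]) for word in vocabulary]


def l2_normalize(vector):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = sum(x * x for x in vector) ** 0.5
    return [x / norm for x in vector] if norm > 0 else list(vector)


def hybrid_features(image_vector, ocr_tokens, vocabulary):
    """Concatenate normalized image features with normalized text features."""
    text_vector = text_features(ocr_tokens, vocabulary)
    return l2_normalize(image_vector) + l2_normalize(text_vector)


# Hypothetical inputs: a 2-dimensional image embedding and OCR output from a sign.
vocabulary = ["stop", "exit", "sale"]
features = hybrid_features([3.0, 4.0], ["EXIT"], vocabulary)
print(len(features))  # 2 image dimensions + 3 text dimensions
```

Normalizing each part before concatenation is one simple way to keep either modality from dominating by scale alone; a downstream classifier would then be trained on the combined vector.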
