Abstract

Text in natural images is an important source of information, which can be utilized for many real-world applications. This work focuses on a new problem: distinguishing images that contain text from a large volume of natural images. To address this problem, we propose a novel convolutional neural network variant, called multi-scale spatial partition network (MSP-Net). The network classifies images that contain text or not, by predicting text existence in all image blocks, which are spatial partitions at multiple scales on an input image. The whole image is classified as a text image (an image containing text) as long as one of the blocks is predicted to contain text. The network classifies images very efficiently by predicting all blocks simultaneously in a single forward propagation. Through experimental evaluations and comparisons on public datasets, we demonstrate the effectiveness and robustness of the proposed method.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call