Abstract

We propose a method to extract special objects in images of medieval books, which generally represent, for example, figures and capital letters. Instead of working on the single-pixel level, we consider superpixels as the basic classification units for improved time efficiency. More specifically, we classify superpixels into different categories/objects by using a bag-of-features approach, where a superpixel category classifier is trained with the local features of the superpixels of the training images. With the trained classifier, we are able to assign the category labels to the superpixels of a historical document image under test. Finally, special objects can easily be identified and extracted after analyzing the categorization results. Experimental results demonstrate that, as compared to the state-of-the-art algorithms, our method provides comparable performance for some historical books but greatly outperforms them in terms of generality and computational time.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call