Abstract

This article presents an elegant technique for extracting the low-level stroke features, such as endpoints, junction points, line elements, and curve elements, from offline printed text using a template matching approach. The proposed features are used to classify a subset of characters from Gujarati script. The database consists of approximately 16,782 samples of 42 middle-zone symbols from the Gujarati character set collected from three different sources: machine printed books, newspapers, and laser printed documents. The purpose of this division is to add variety in terms of size, font type, style, ink variation, and boundary deformation. The experiments are performed on the database using a k-nearest neighbor (kNN) classifier and results are compared with other widely used structural features, namely Chain Codes (CC), Directional Element Features (DEF), and Histogram of Oriented Gradients (HoG). The results show that the features are quite robust against the variations and give comparable performance with other existing works.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call