Neural net based complete character recognition scheme for Bangla printed text books

Sk Alamgir Hossain, Tamanna Tabassum

doi:10.1109/iccitechn.2014.6997336

Abstract

In this paper we propose a neural-network-based character recognition scheme for Bangla printed text books. A large body of scientific literature, novels, magazines and books is written in Bangla, a language used by more than 400 million people. Most libraries and educational institutions want to keep copies of these books in a digital format. Storing them as digital text requires a good character recognition scheme that can convert scanned book images into editable text, and this need is the focus of our research. The proposed scheme has four main stages: preprocessing, segmentation, training and recognition, and post-processing. First, the input book images are preprocessed by rotation, scaling, binarization and noise elimination. The binarized image is then segmented into individual characters, which are used to train an artificial neural network and are then recognized by it. Finally, the text is reconstructed in the post-processing stage.
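As a rough illustration of the binarization and segmentation stages named in the abstract, the following minimal sketch thresholds a grayscale page and splits it using horizontal and vertical projection profiles. The abstract does not say which binarization or segmentation method the authors use, so the midpoint threshold, the projection-profile approach, and the function names `binarize`, `runs` and `segment` are all assumptions made for illustration. For real Bangla text the column split would mostly return words rather than characters, because the headline (matra) joins the characters of a word; a practical system would have to handle the headline before splitting.

```python
import numpy as np


def binarize(gray, threshold=None):
    """Return a boolean array in which True marks ink (dark) pixels."""
    gray = np.asarray(gray, dtype=float)
    if threshold is None:
        # Assumption: use the midpoint between the darkest and brightest
        # pixel; the paper's actual thresholding rule is not given.
        threshold = (gray.min() + gray.max()) / 2.0
    return gray < threshold


def runs(mask):
    """Return (start, end) pairs of consecutive True runs, end exclusive."""
    padded = np.concatenate(([False], np.asarray(mask, dtype=bool), [False]))
    # Positions where the mask switches between False and True.
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(changes[::2], changes[1::2])]


def segment(ink):
    """Split a binary page into boxes (top, bottom, left, right).

    Rows containing ink are grouped into text lines, then each line is
    split at columns that contain no ink.
    """
    boxes = []
    for top, bottom in runs(ink.any(axis=1)):
        line = ink[top:bottom]
        for left, right in runs(line.any(axis=0)):
            boxes.append((top, bottom, left, right))
    return boxes
```

Calling `segment(binarize(page))` on a grayscale page array returns the boxes in reading order (top to bottom, left to right); each box could then be cropped, scaled to a fixed size, and passed to the recognizer.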

Full Text