Abstract

Amharic language is the second most spoken language in the Semitic family after Arabic. In Ethiopia and neighboring countries more than 100 million people speak the Amharic language. There are many historical documents that are written using the Geez script. Digitizing historical handwritten documents and recognizing handwritten characters is essential to preserving valuable documents. Handwritten digit recognition is one of the tasks of digitizing handwritten documents from different sources. Currently, handwritten Geez digit recognition researches are very few, and there is no available organized dataset for the public researchers. Convolutional neural network (CNN) is preferable for pattern recognition like in handwritten document recognition by extracting a feature from different styles of writing. In this work, the proposed model is to recognize Geez digits using CNN. Deep neural networks, which have recently shown exceptional performance in numerous pattern recognition and machine learning applications, are used to recognize handwritten Geez digits, but this has not been attempted for Ethiopic scripts. Our dataset, which contains 51,952 images of handwritten Geez digits collected from 524 individuals, is used to train and evaluate the CNN model. The application of the CNN improves the performance of several machine-learning classification methods significantly. Our proposed CNN model has an accuracy of 96.21% and a loss of 0.2013. In comparison to earlier research works on Geez handwritten digit recognition, the study was able to attain higher recognition accuracy using the developed CNN model.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.