Abstract

This article proposes a handwritten Malayalam character recognition model built on an AlexNet-based architecture. The Malayalam script contains many characters with similar features, which makes differentiating them a challenging task. Many handcrafted feature extraction methods have been used to classify Malayalam characters. Convolutional Neural Networks (CNNs) are among the popular methods used in image and language recognition. Here, an AlexNet-based CNN is proposed for extracting features from basic and compound Malayalam characters, and a Support Vector Machine (SVM) is used to classify them. Using this model, the 44 primary and 36 compound Malayalam characters are recognised with improved accuracy and minimal time consumption. A dataset of about 180,000 characters is used for training and testing. The proposed model achieves an efficiency of 98% on this dataset. In addition, a dataset of Malayalam characters was developed in this work and shared on the Internet.

Highlights

  • Optical Character Recognition (OCR) is the process of recognising handwritten text and printed text

  • The proposed Convolutional Neural Network (CNN) model has a 24-layer, AlexNet-like architecture that extracts character features; a Support Vector Machine (SVM) then classifies the characters

  • Because handwritten characters vary in writing style from person to person, an automated feature extraction process makes Malayalam character recognition easier
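The division of labour described in the highlights (convolutional layers produce features, an SVM makes the final decision) can be illustrated with a toy sketch. This is not the authors' 24-layer network: it assumes a single convolution layer with two hand-picked 3x3 filters, ReLU, and global max pooling, followed by a two-class linear SVM trained by sub-gradient descent on the hinge loss. All names (`conv2d_valid`, `extract_features`, `train_linear_svm`, `bar_image`) and the bar-shaped toy images are hypothetical stand-ins for real character data.

```python
import random

def conv2d_valid(img, kernel):
    """Valid 2-D cross-correlation followed by ReLU, on list-of-lists images."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(img) - kh + 1):
        row = []
        for c in range(len(img[0]) - kw + 1):
            s = sum(img[r + i][c + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(max(0.0, s))  # ReLU
        out.append(row)
    return out

# Assumption: fixed, hand-picked filters. A trained CNN would learn its filters.
KERNELS = [
    [[1, 1, 1], [0, 0, 0], [-1, -1, -1]],  # responds to the top edge of a horizontal stroke
    [[1, 0, -1], [1, 0, -1], [1, 0, -1]],  # responds to the left edge of a vertical stroke
]

def extract_features(img):
    """One feature per filter: global max pooling over the ReLU feature map."""
    return [max(max(row) for row in conv2d_valid(img, k)) for k in KERNELS]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Sub-gradient descent on the L2-regularised hinge loss; labels are +1 or -1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            violated = margin < 1
            for j in range(len(w)):
                grad = lam * w[j] - (y[i] * X[i][j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * y[i]
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def bar_image(kind, pos, size=6):
    """Toy 'character': a horizontal ('h') or vertical ('v') bar at index pos."""
    img = [[0] * size for _ in range(size)]
    for k in range(size):
        if kind == "h":
            img[pos][k] = 1
        else:
            img[k][pos] = 1
    return img

# Train on four toy images: horizontal bars are class +1, vertical bars class -1.
train = [("h", 0), ("h", 2), ("v", 1), ("v", 3)]
X = [extract_features(bar_image(kind, pos)) for kind, pos in train]
y = [1 if kind == "h" else -1 for kind, _ in train]
w, b = train_linear_svm(X, y)

# Classify two bars at positions not seen during training.
pred_h = predict(w, b, extract_features(bar_image("h", 1)))
pred_v = predict(w, b, extract_features(bar_image("v", 2)))
```

A real system of the kind described would need a multi-class SVM (for example one-vs-rest over all character classes) and learned convolutional filters; the sketch only shows how pooled CNN activations can serve as the SVM's input vector.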


Summary

Introduction

Optical Character Recognition (OCR) is the process of recognising handwritten and printed text. The recognised text is converted into an encoded format. The four essential steps in character recognition are: a) data pre-processing, b) segmentation, c) feature extraction, and d) classification [1]. The Malayalam script consists of 15 vowels, 36 consonants, and 5 pure consonants, as shown in Figure 1. There are 12 dependent vowels and 144 compound characters, as shown in Figures 2 and 3, respectively. Characters are compounded both vertically and horizontally in the Malayalam script.
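The four recognition steps listed above can be sketched end to end on a tiny example. The code below is only an illustration under simple assumptions, not the method used in the article: pre-processing is a fixed-threshold binarisation, segmentation splits a text line at blank columns, feature extraction measures ink density in the top and bottom halves of each glyph, and classification picks the nearest stored template. The function names and the Latin "T" and "L" glyphs are hypothetical placeholders for Malayalam characters.

```python
import math

def preprocess(gray, threshold=128):
    """a) Pre-processing: binarise so that dark ink becomes 1 and background 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def segment(binary):
    """b) Segmentation: cut the line into glyphs wherever a column has no ink."""
    width = len(binary[0])
    col_has_ink = [any(row[c] for row in binary) for c in range(width)]
    spans, start = [], None
    for c, ink in enumerate(col_has_ink + [False]):  # sentinel closes the last glyph
        if ink and start is None:
            start = c
        elif not ink and start is not None:
            spans.append((start, c))
            start = None
    return [[row[a:b] for row in binary] for a, b in spans]

def extract_features(glyph):
    """c) Feature extraction: ink density of the top half and the bottom half."""
    half = len(glyph) // 2

    def density(rows):
        return sum(sum(r) for r in rows) / sum(len(r) for r in rows)

    return [density(glyph[:half]), density(glyph[half:])]

def classify(features, templates):
    """d) Classification: label of the nearest template in Euclidean distance."""
    return min(templates, key=lambda label: math.dist(features, templates[label]))

def recognise(gray, templates):
    glyphs = segment(preprocess(gray))
    return [classify(extract_features(g), templates) for g in glyphs]

# Hypothetical 4x3 template glyphs (1 = ink).
T = [[1, 1, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0]]
L = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 1, 1]]
TEMPLATES = {"T": extract_features(T), "L": extract_features(L)}

# A grey-scale text line: "T", one blank column, then "L" (0 = ink, 255 = paper).
line = [[0 if v else 255 for v in T[r] + [0] + L[r]] for r in range(4)]
result = recognise(line, TEMPLATES)  # -> ['T', 'L']
```

Column-gap segmentation would not be enough for a script in which characters are compounded vertically as well as horizontally, so a practical Malayalam pipeline would need a more capable segmentation step than this sketch assumes.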

