Abstract

Handwritten character recognition has been studied for many mainstream languages, and handwritten letter recognition in particular has shown promising results. Several studies have applied deep learning models to improve accuracy. In this paper, the authors conducted experiments with two models on the EMNIST Letters dataset: Wavemix-Lite and CoAtNet. The Wavemix-Lite model uses a level-1 two-dimensional discrete wavelet transform (2D DWT) to reduce the number of parameters and speed up the runtime. CoAtNet combines a CNN with a Vision Transformer, in which the image is broken down into fixed-size patches. The feature extraction part of each model embeds the input image into a feature vector. From the two trained models, the authors hooked the output of the Global Average Pool layer on the EMNIST Letters data. These hooked features were then used to train machine learning classifiers, namely SVM, Random Forest, and XGBoost. The experiments show that the best machine learning classifier is the Random Forest, with 96.03% accuracy on Wavemix-Lite features and 97.90% accuracy on CoAtNet features. These results showcase the benefit of using a machine learning model to classify image features extracted by a deep learning model.
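To make the level-1 2D DWT step concrete, the following is a minimal sketch in plain Python. It assumes a Haar wavelet, which the abstract does not specify, and the function names `haar_dwt2_level1` and `global_average_pool` are hypothetical, not taken from the authors' code. It shows how a 28x28 input (the EMNIST image size) is split into four 14x14 subbands, which is the source of the spatial reduction, and what a global average pool computes per channel. In the paper the pooling is applied to learned feature maps inside the trained network; here it is applied to the raw subbands only for illustration.

```python
def haar_dwt2_level1(image):
    """One level of a 2D Haar DWT (orthonormal scaling) on a list-of-lists image.

    Returns four subbands, each half the height and width of the input:
    approximation, row-difference detail, column-difference detail, diagonal detail.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")

    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block of pixels handled in this step.
            p00, p01 = image[r][c], image[r][c + 1]
            p10, p11 = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p00 + p01 + p10 + p11) / 2)  # local average (scaled)
            r_row.append((p00 + p01 - p10 - p11) / 2)  # top row minus bottom row
            c_row.append((p00 - p01 + p10 - p11) / 2)  # left column minus right column
            d_row.append((p00 - p01 - p10 + p11) / 2)  # diagonal difference
        approx.append(a_row)
        row_diff.append(r_row)
        col_diff.append(c_row)
        diag.append(d_row)
    return approx, row_diff, col_diff, diag


def global_average_pool(channels):
    """Collapse each 2D channel to its mean, giving one value per channel."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]


# Synthetic 28x28 image standing in for an EMNIST Letters sample (not real data).
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]
subbands = haar_dwt2_level1(image)        # four 14x14 subbands
features = global_average_pool(subbands)  # feature vector of length 4
```

A vector like `features`, taken from the network's Global Average Pool layer rather than from raw subbands, is what would then be passed to a classical classifier such as a Random Forest.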
