Achieving high recognition performance is one of the most important goals for handwritten Arabic character recognition systems. In general, Optical Character Recognition (OCR) systems consist of four phases: pre-processing, feature extraction, feature selection, and classification. Recent literature has focused on the selection of appropriate features as a key step towards building a successful and sufficient character recognition system. In this paper, we propose a hybrid machine learning approach that combines neighborhood rough sets with a binary whale optimization algorithm to select the most appropriate features for the recognition of handwritten Arabic characters. To validate the proposed approach, we used the CENPARMI dataset, a well-known dataset for machine learning experiments involving handwritten Arabic characters. The results show clear advantages of the proposed approach in recognition accuracy, memory footprint, and processor time compared with results obtained without the proposed feature selection. When compared with other recent state-of-the-art optimization algorithms, the proposed approach outperformed all of them in all experiments. Moreover, the proposed approach achieves the highest recognition rate with the lowest time consumption compared to deep neural networks such as VGGnet, Resnet, Nasnet, Mobilenet, Inception, and Xception. The proposed approach was also compared with recently published works using the same dataset, which further confirmed its high classification accuracy and low time consumption. The misclassified cases were also analyzed; the analysis showed that they would likely be confusing even for native Arabic readers, because the correct interpretation of the characters requires the context in which they appear.