Abstract

Handwritten word recognition is undoubtedly a challenging task due to various writing styles of individuals. So, lots of efforts are put to recognize handwritten words using efficient classifiers based on extracted features that rely on the visual appearance of the handwritten text. Due to numerous real-time applications, handwritten word recognition is an important research area which is seeking a lot of attention from researchers for the last 10 years. In this article, the authors have proposed a holistic approach and eXtreme Gradient Boosting (XGBoost) technique to recognize offline handwritten Gurumukhi words. In this direction, four state-of-the-art features like zoning, diagonal, intersection & open-end points and peak extent features have been considered to extract discriminant features from the handwritten word digital images. The proposed approach is evaluated on a public benchmark dataset of Gurumukhi script that comprises 40,000 samples of handwritten words. Based on extracted features, the words are classified into one of the 100 classes based on XGBoost technique. Effectiveness of the system is assessed based on several evaluation parameters like CPU elapsed time, accuracy, precision, recall, F1-score and area under curve (AUC). XGBoost technique attained the best results of accuracy (91.66%), recall (91.66%), precision (91.39%), F1-score (91.14%) and AUC (95.66%) using zoning features based on 90% data as the training set and remaining 10% data as the testing set. The comparison of the proposed approach with the existing approaches has also been done which reveals the significance of the XGBoost technique comparatively.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.