Abstract
Script identification is crucial for document analysis and optical character recognition (OCR). This study proposes YafNet, a novel convolutional neural network (CNN) architecture, developed from scratch, to tackle the challenges of script identification in both handwritten and printed word images. YafNet dynamically weights features, enabling it to learn and combine multimodal features for accurate script identification. To evaluate its efficacy, we use the imbalanced ICDAR 2021 Script Identification in the Wild (SIW 2021) competition dataset. Experimental results demonstrate that YafNet outperforms conventional approaches, particularly when trained on mixed handwritten and printed data. It achieves high classification accuracy, balanced accuracy, and ROC AUC scores, indicating its robustness and generalizability. The incorporation of data augmentation and external data further enhances performance, underscoring the model's potential for real-world applications.