Abstract

Writing is an important carrier of cultural inheritance, and digitizing handwritten texts is an effective way to protect national culture. Compared with Chinese and English handwriting recognition, research on Mongolian handwriting recognition started relatively late and has produced few results, owing to the characteristics of the script itself and the lack of corpora. First, the random erasing data augmentation algorithm was modified to suit the characteristics of Mongolian handwritten characters, and a dual data augmentation (DDA) algorithm was proposed by combining the improved algorithm with horizontal wave transformation (HWT) to augment the dataset used to train the Mongolian handwriting recognition model. Second, the classical CRNN handwriting recognition model was improved: the structure of the encoder and decoder was adjusted according to the characteristics of the Mongolian script, and an attention mechanism was introduced in the feature extraction and decoding stages. The resulting model, named the EGA model, is tailored to the features of Mongolian handwriting. Finally, the effectiveness of the EGA model was verified by tests on a large amount of data. Experimental results demonstrate that the proposed EGA model improves the recognition accuracy of Mongolian handwriting, and that the structural modification of the encoder and decoder effectively balances recognition accuracy against model complexity.
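
As a rough illustration of the two augmentation steps named above, the following minimal sketch chains the standard random erasing operation with a simple sinusoidal row shift. The abstract does not specify how random erasing was modified or how HWT is parameterized, so everything here is an assumption: the function names, the `area_frac`, `amplitude`, and `period` parameters, the white fill value, the order in which the two steps are applied, and the representation of a grayscale image as a list of pixel rows are all hypothetical, and the erasing step is the unmodified textbook version rather than the improved algorithm.

```python
import math
import random


def random_erase(img, rng, area_frac=0.05, fill=255):
    """Standard random erasing: blank one randomly placed rectangle.

    Assumption: this is the unmodified technique, not the improved variant.
    """
    h, w = len(img), len(img[0])
    side = math.sqrt(area_frac)          # keep the image's aspect ratio
    eh = max(1, round(h * side))         # erased height in pixels
    ew = max(1, round(w * side))         # erased width in pixels
    top = rng.randint(0, h - eh)         # randint is inclusive on both ends
    left = rng.randint(0, w - ew)
    out = [row[:] for row in img]        # copy so the input is untouched
    for y in range(top, top + eh):
        out[y][left:left + ew] = [fill] * ew
    return out


def horizontal_wave(img, amplitude=2.0, period=32.0, fill=255):
    """Shift each row sideways by a sinusoidal function of its row index.

    Assumption: one plausible reading of "horizontal wave transformation".
    """
    w = len(img[0])
    out = []
    for y, row in enumerate(img):
        dx = round(amplitude * math.sin(2 * math.pi * y / period))
        if dx > 0:                       # shift right, pad on the left
            out.append([fill] * dx + row[:w - dx])
        elif dx < 0:                     # shift left, pad on the right
            out.append(row[-dx:] + [fill] * (-dx))
        else:
            out.append(row[:])
    return out


def dual_augment(img, rng):
    """Hypothetical DDA-style pipeline: erase first, then apply the wave."""
    return horizontal_wave(random_erase(img, rng))


if __name__ == "__main__":
    rng = random.Random(0)
    # A tall, narrow black image, loosely mimicking a vertical word image.
    word = [[0] * 16 for _ in range(64)]
    augmented = dual_augment(word, rng)
    print(len(augmented), len(augmented[0]))
```

Shifting whole rows horizontally is chosen here because Mongolian is written vertically, so a horizontal wave perturbs the stroke positions along the writing direction without changing the image size.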
