Abstract

Recently, thanks to the successful application of the attention-based encoder-decoder framework, handwritten mathematical expression recognition (HMER) has achieved significant improvement. However, HMER is still a challenging task in the handwriting recognition area, which suffers from the ambiguity of handwritten symbols, the two-dimensional structure of mathematical expressions, and the lack of labeled data. In this paper, we attempt to improve the recognition performance and generalization ability of the existing state-of-the-art method from two perspectives: data augmentation and model design. We first propose a tree-based multi-level (including symbol level, sub-expression level, and image level) data augmentation strategy, which can generate many synthetic images. Then, we present a novel encoder-decoder hybrid model via tree-based mutual learning to fully utilize the complementarity between tree decoder and string decoder. Benefitting from our data augmentation strategy, we achieve 58.47%/57.82%/62.67% and 74.45% expression recognition accuracy respectively on the CROHME14/16/19 competition datasets and the OffRaSHME20 competition dataset. Moreover, tree-based data augmentation is a key technology to our champion system for the OffRaSHME20 competition. Our tree-based mutual learning method further improves the recognition accuracy to 61.63%/59.81%/64.38% and 75.68% on these datasets. Further quantitative and qualitative analyses also demonstrate the effectiveness and robustness of our proposed methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call