Abstract
The accurate recognition of respondents’ handwritten solutions is important for implementing intelligent diagnosis and tutoring. This task is significantly challenging because of scribbled and irregular writing, especially when handling primary or secondary students whose handwriting has not yet been fully developed. Recognition becomes difficult in such cases even for humans relying only on the visual signals of handwritten content without any context. However, despite decades of work on handwriting recognition, few studies have explored the idea of utilizing external information (question priors) to improve the accuracy. Based on the correlation between questions and solutions, this study aims to explore whether question-texts can improve the recognition of handwritten mathematical expressions (HMEs) in respondents’ solutions. Based on the encoder–decoder framework, which is the mainstream method for HME recognition, we propose two models for fusing question-text signals and handwriting-vision signals at the encoder and decoder stages, respectively. The first, called encoder-fusion, adopts a static query to implement the interaction between two modalities at the encoder phase, and to better catch and interpret the interaction, a fusing method based on a dynamic query at the decoder stage, called decoder-attend is proposed. These two models were evaluated on a self-collected dataset comprising approximately 7k samples and achieved accuracies of 62.61% and 64.20%, respectively, at the expression level. The experimental results demonstrated that both models outperformed the baseline model, which utilized only visual information. The encoder fusion achieved results similar to those of other state-of-the-art methods.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.