Abstract

Traditional methods for handwritten character recognition rely on extensive labeled data. Humans, however, can generalize to unseen handwritten characters after seeing only a few printed examples in textbooks. To simulate this ability, we propose a cross-modal prototype learning method (CMPL) for zero-shot recognition. For each character class, a prototype is generated by mapping a printed character into a deep neural network feature space. For an unseen character class, the prototype can be produced directly from a printed sample, so class-incremental learning requires no handwritten samples. Specifically, CMPL considers three modalities simultaneously: online handwritten trajectories, offline handwritten images, and auxiliary printed character images. These modalities are learned jointly by sharing printed prototypes between the online and offline data. In zero-shot inference, we feed printed samples to CMPL to obtain the corresponding class prototypes, and an unseen handwritten character is then recognized by its nearest prototype. Experimental results demonstrate that CMPL outperforms state-of-the-art methods in both online and offline zero-shot handwritten Chinese character recognition. Moreover, we show the cross-domain generalization of CMPL from two perspectives: cross-language and modern-to-ancient handwritten character recognition, i.e., transferability across languages and across writing styles (modern and historical handwriting).
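
The inference procedure the abstract describes (prototypes from printed exemplars, nearest-prototype classification of handwritten inputs) can be sketched as follows. This is a minimal illustration only, assuming two hypothetical encoders that map printed and handwritten inputs into the shared feature space; the module names, tensor shapes, and cosine metric are our assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F


def build_prototypes(printed_encoder, printed_images):
    """Embed one printed exemplar per class into the shared feature space.

    printed_images: (C, 1, H, W), one image per character class.
    Returns a (C, D) matrix of unit-norm class prototypes.
    """
    with torch.no_grad():
        protos = printed_encoder(printed_images)   # (C, D)
    return F.normalize(protos, dim=-1)


def classify(handwritten_encoder, prototypes, samples):
    """Nearest-prototype classification of handwritten inputs.

    samples: (N, 1, H, W) handwritten images (or encoded trajectories).
    Returns (N,) predicted class indices via highest cosine similarity.
    """
    with torch.no_grad():
        feats = F.normalize(handwritten_encoder(samples), dim=-1)  # (N, D)
    sims = feats @ prototypes.T                                    # (N, C)
    return sims.argmax(dim=-1)


# Class-incremental extension: embedding a new printed exemplar adds an
# unseen class without any handwritten training data (hypothetical usage).
# new_proto = build_prototypes(printed_encoder, new_printed_image[None])
# prototypes = torch.cat([prototypes, new_proto], dim=0)
```

Because prototypes come only from printed exemplars, extending the classifier to a new class is a single embedding step, which is what makes the zero-shot and class-incremental claims possible.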
