Joint optimization for attention-based generation and recognition of Chinese characters using tree position embedding

Mobai Xue, Jun Du, Bin Wang, Bo Ren, Yu Hu

doi:10.1016/j.patcog.2023.109538

Abstract

Despite the growing interest in Chinese character generation, creating a nonexistent character remains an open challenge. Radical-based Chinese character generation is still a new task, whereas radical-based Chinese character recognition is more mature. To make full use of knowledge from the recognition task, we first propose an attention-based generator, which uses an attention mechanism to select the most relevant radical when generating each zone of a character. We then present a joint optimization approach for training the generation and recognition models, which helps the generator and the recognizer learn from each other effectively. The joint optimization is implemented via contrastive learning and dual learning. Exploiting the symmetry between generation and recognition, contrastive learning aims to strengthen the encoder of the recognizer and the decoder of the generator. Since the generation and recognition tasks form a closed loop, dual learning feeds the output of one model to the other as input. Based on the feedback signals generated during the two tasks, the two models are updated iteratively until convergence. Finally, because our model ignores the order information of a sequence, we use position embedding to extend its image representation ability and propose tree position embedding to represent positional information in the tree-structured captions of Chinese characters. Experimental results in printed and natural-scene settings show that the proposed method improves the quality of the generated images and increases the recognition accuracy for Chinese characters.

Full Text