Domain gaps between synthetic and real-world text limit current text recognition methods. One solution is to align features through Unsupervised Domain Adaptation (UDA). Most existing UDA-based text recognition methods extract global and local features to alleviate domain differences, but they focus only on character-level distribution gaps. However, notable distribution gaps also exist in character combinations, and these exert a pivotal influence on diverse text recognition tasks. To this end, we propose a Multi-level And multi-Granularity domain adaptation with entropy loss guIded text reCognition model, named MAGIC. It integrates Global-level Domain Adaptation (GDA) to mitigate image-level domain drift and Local-level Multi-granularity Domain Adaptation (LMDA) to address local feature shifts. In particular, we design a subword-level domain discriminator to align the subword features corresponding to each character combination. Moreover, multi-granularity entropy minimization is applied to the target-domain data for better domain adaptation. Experimental results on several types of text datasets demonstrate the effectiveness of MAGIC.
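As a rough illustration of the entropy-minimization idea mentioned above (a minimal generic sketch, not the paper's exact multi-granularity formulation; the function name and inputs are hypothetical): the loss penalizes uncertain predictions on unlabeled target-domain data, pushing the model toward confident, low-entropy outputs.

```python
import math

def entropy_loss(probs):
    """Mean Shannon entropy over a batch of per-position character
    probability distributions (hypothetical helper for illustration;
    the abstract does not give MAGIC's exact loss)."""
    eps = 1e-12  # avoid log(0)
    per_position = [
        -sum(p * math.log(p + eps) for p in dist) for dist in probs
    ]
    return sum(per_position) / len(per_position)

# A confident prediction yields low entropy; a uniform one, high entropy.
confident = [[0.97, 0.01, 0.01, 0.01]]
uniform = [[0.25, 0.25, 0.25, 0.25]]
print(entropy_loss(confident) < entropy_loss(uniform))  # True
```

Minimizing this quantity on target-domain predictions encourages the decision boundary to avoid dense regions of the target feature space, which is the standard motivation for entropy-based losses in unsupervised domain adaptation.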