Scene text recognition is a popular research topic that is also widely used in industry. Although many methods achieve satisfactory performance on closed-set text recognition challenges, they become impractical in open-set scenarios, where collecting data or retraining models for novel characters can be costly. For example, annotating samples for foreign languages can be expensive, and retraining the model each time a “novel” character is discovered in historical documents costs both time and resources. In this paper, we introduce and formulate a new open-set text recognition task, which demands the capability to spot and recognize novel characters without retraining. We also propose a label-to-prototype learning framework as a baseline for the new task. Specifically, the framework introduces a generalizable label-to-prototype mapping function to build prototypes (class centers) for both seen and unseen classes. An open-set predictor then recognizes or rejects samples according to the prototypes. The ability to reject out-of-set characters allows unknown characters to be spotted automatically in the incoming data stream. Extensive experiments show that our method achieves promising performance on a variety of zero-shot, closed-set, and open-set text recognition datasets.
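The recognize-or-reject idea can be illustrated with a minimal sketch. It is not the framework proposed in the paper: the learned label-to-prototype mapping is replaced by a hypothetical `label_to_prototype` function that simply flattens a tiny glyph template, the similarity measure (cosine) and the `threshold` value are assumptions, and all names are illustrative.

```python
import math

UNKNOWN = "<unknown>"  # hypothetical marker for rejected (out-of-set) samples


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def label_to_prototype(glyph):
    """Stand-in for a learned mapping: flatten a glyph template into a vector.

    Assumption: a real system would use a trained encoder here, so that
    prototypes for unseen labels can be built without retraining.
    """
    return [float(pixel) for row in glyph for pixel in row]


def predict(feature, prototypes, threshold=0.8):
    """Return the label of the most similar prototype, or UNKNOWN if the
    best similarity falls below the (assumed) rejection threshold."""
    best_label, best_score = UNKNOWN, -1.0
    for label, proto in prototypes.items():
        score = cosine(feature, proto)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else UNKNOWN


# Prototypes for two "seen" classes, built from toy 2x2 glyph templates.
glyphs = {"A": [[1, 0], [0, 1]], "B": [[0, 1], [1, 0]]}
prototypes = {label: label_to_prototype(g) for label, g in glyphs.items()}

novel_sample = [1.0, 0.9, 0.0, 0.1]
before = predict(novel_sample, prototypes)  # no close prototype: rejected

# Registering a new label only adds a prototype; nothing is retrained.
prototypes["C"] = label_to_prototype([[1, 1], [0, 0]])
after = predict(novel_sample, prototypes)  # now matched to the new class
```

In this toy setup, the sample is rejected while only "A" and "B" are known, and it is recognized once a prototype for "C" is registered. This mirrors the two capabilities the task asks for: spotting unknown characters and recognizing novel ones without retraining.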