SECANet: A structure‐enhanced attention network with dual‐domain contrastive learning for scene text image super‐resolution

Xin He, Kaibing Zhang, Hui Zhang, Yuhong Zhang

doi:10.1049/ell2.13057

Abstract

In this letter, we develop a novel Structure Enhanced Channel Attention Network (SECANet) for scene text image super‐resolution (STISR). SECANet integrates a group of Structure‐Enhanced Attention Modules that focus on both local and global structural features in the character regions of text images. Moreover, we formulate a Dual‐Domain Contrastive Learning framework that combines a pixel‐level contrastive loss with a semantic‐level contrastive loss to jointly optimize SECANet. The resulting SR images are more visually pleasing and more recognizable, and the framework introduces no additional prior generators in either the training or the testing stage, which keeps the method computationally efficient. Experimental results on the TextZoom dataset indicate that our method both super‐resolves low‐resolution scene text images well and achieves better recognition accuracy than competing methods.
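
As a rough illustration of how a pixel‐level and a semantic‐level contrastive term might be combined, the sketch below uses a generic ratio‐form contrastive loss in NumPy. It is not the formulation from the letter: the choice of the HR image as positive and the upsampled LR image as negative, the L1 distance, the weights `w_pixel` and `w_semantic`, and the `toy_features` extractor are all assumptions made for illustration.

```python
import numpy as np


def l1(a, b):
    """Mean absolute difference between two arrays of the same shape."""
    return float(np.mean(np.abs(a - b)))


def contrastive_ratio(anchor, positive, negative, eps=1e-8):
    """Generic ratio-form contrastive term (assumed, not from the letter).

    Small when the anchor is close to the positive and far from the negative.
    """
    return l1(anchor, positive) / (l1(anchor, negative) + eps)


def toy_features(img):
    """Hypothetical stand-in for a semantic feature extractor.

    Uses column and row means of a 2-D image; a real system would more
    likely use features from a pretrained text recognizer.
    """
    return np.concatenate([img.mean(axis=0), img.mean(axis=1)])


def dual_domain_loss(sr, hr, lr_up, feat_fn=toy_features,
                     w_pixel=1.0, w_semantic=1.0):
    """Weighted sum of a pixel-domain and a feature-domain contrastive term.

    sr:    super-resolved image (anchor)
    hr:    ground-truth high-resolution image (assumed positive)
    lr_up: low-resolution input upsampled to HR size (assumed negative)
    """
    pixel = contrastive_ratio(sr, hr, lr_up)
    semantic = contrastive_ratio(feat_fn(sr), feat_fn(hr), feat_fn(lr_up))
    return w_pixel * pixel + w_semantic * semantic


if __name__ == "__main__":
    hr = np.zeros((4, 4))
    lr_up = np.ones((4, 4))
    sr = np.full((4, 4), 0.25)  # closer to hr than to lr_up
    print(dual_domain_loss(sr, hr, lr_up))
```

Because both terms depend only on the SR, HR, and LR images, a loss of this shape needs no extra prior generator, which is the property the abstract emphasizes.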

Full Text