Few-shot prototype alignment regularization network for document image layout segmentation

Yujie Li, Pengfei Zhang, Xing Xu, Yi Lai, Fumin Shen, Lijiang Chen, Pengxiang Gao

doi:10.1016/j.patcog.2021.107882

Abstract

Although semantic segmentation methods perform well on layout analysis tasks, they usually need a large number of annotated images for training and struggle to learn a new category that is absent from the training categories. Meta-learning and few-shot segmentation have been developed to address these two difficulties. In this paper, we propose a novel method dubbed Few-Shot Prototype Alignment Regularization Network (FS-PARN). FS-PARN is inspired by recent studies in both metric learning and few-shot segmentation, and it needs only a few annotated images. Through metric learning, FS-PARN makes better use of the information in the support set and improves segmentation quality. It learns class prototypes within an embedding space and then classifies each pixel of the query image by matching it against the learned prototypes. In addition to obtaining high-quality prototypes through metric learning, FS-PARN introduces prototype alignment regularization between the support and query sets to further improve segmentation. Notably, our FS-PARN model achieves mean-IoU scores of 28.8% and 31.7% on the practical document image datasets, i.e. PASCAL-5i, DSSE-200, and Layout Analysis Dataset, for the 1-shot and 5-shot settings respectively.
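
The prototype matching and alignment idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that prototypes are obtained by masked average pooling over support features, that pixels are matched to prototypes by cosine similarity, and that the alignment term is a cross-entropy on the support image segmented with prototypes rebuilt from the predicted query mask. All function names, the two-class (background/foreground) setup, and the `scale` factor are hypothetical choices made for illustration.

```python
import numpy as np

def masked_average_prototype(features, mask):
    """Average the (H, W, C) features over pixels where mask is set (assumed pooling)."""
    mask = mask.astype(features.dtype)
    total = (features * mask[..., None]).sum(axis=(0, 1))
    return total / max(mask.sum(), 1e-8)

def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between every pixel embedding and one prototype."""
    f_norm = np.linalg.norm(features, axis=-1) + eps
    p_norm = np.linalg.norm(prototype) + eps
    return (features @ prototype) / (f_norm * p_norm)

def segment_query(query_features, prototypes):
    """Label each query pixel with the index of its most similar prototype."""
    scores = np.stack(
        [cosine_similarity_map(query_features, p) for p in prototypes], axis=-1
    )
    return scores.argmax(axis=-1)

def alignment_loss(support_features, support_mask, query_features, query_pred, scale=20.0):
    """Hypothetical alignment term: rebuild prototypes from the predicted query
    mask, segment the support image with them, and penalize disagreement with
    the true support mask using cross-entropy. `scale` is an assumed temperature."""
    protos = [masked_average_prototype(query_features, query_pred == c) for c in (0, 1)]
    scores = scale * np.stack(
        [cosine_similarity_map(support_features, p) for p in protos], axis=-1
    )
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=-1, keepdims=True))
    labels = support_mask.astype(int)
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)
    return -picked.mean()
```

In use, the support features and mask would yield one background and one foreground prototype (index 0 and 1), `segment_query` would label the query pixels, and `alignment_loss` would be added to the ordinary segmentation loss during training. In a real system the features would come from a learned embedding network rather than being supplied directly.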

Full Text