Thangka Image Segmentation Method Based on Enhanced Receptive Field

Hao Wang, Jingyun Hu, Ru Xue, Guangxiu Pan, Yue Liu

doi:10.1109/access.2022.3201086

Open Access

https://doi.org/10.1109/access.2022.3201086

Abstract

The portrait thangka is a kind of religious scroll painting that expresses a figure’s identity and duties through portraits, sitting platforms, and backlighting. Segmenting the significant semantic objects in such an image is one of the essential ways for scholars to study and understand its content. To support this work, we collected a dataset of portrait-like thangkas consisting of 4086 images covering four object categories, and we provide rich annotations for it. In addition, we propose an end-to-end deep learning method that addresses the problems of blurred target edges, segmentation errors, and missed segmentation in thangka image segmentation. First, regular convolutions and atrous convolutions of different sizes are concatenated after the high-level feature output, which enlarges the model’s receptive field while capturing more image feature information. Then, an attention module is introduced to fully utilize the spatial relationships among the image’s semantic content and to enhance the discriminative ability of the feature representation on thangka images. Finally, cross-layer feature fusion is added to reduce the loss of edge details and improve the accuracy of target edge segmentation. The results show that the proposed model reaches an mPA of 90.75% and an mIoU of 85.66%, improving segmentation accuracy on thangka images over the base model.
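The abstract reports results as mPA and mIoU without defining them. As a minimal sketch, assuming the commonly used definitions (mean per-class pixel accuracy and mean per-class intersection over union, both derived from a confusion matrix), the two metrics could be computed as below. The function names and the toy labels are hypothetical and for illustration only; the paper's own evaluation code may differ, for example in how classes absent from the ground truth are handled.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """mPA: average over classes of (correct pixels / ground-truth pixels)."""
    per_class = []
    for i, row in enumerate(cm):
        total = sum(row)
        if total > 0:  # assumption: skip classes absent from the ground truth
            per_class.append(row[i] / total)
    return sum(per_class) / len(per_class)


def mean_iou(cm):
    """mIoU: average over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp
        fp = sum(cm[r][i] for r in range(n)) - tp
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes that never occur at all
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example with flattened pixel labels for two classes (hypothetical data).
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
mpa = mean_pixel_accuracy(cm)
miou = mean_iou(cm)
```

In this toy case one of the two class-0 pixels is misclassified as class 1, so class 0 has a pixel accuracy of 1/2 and an IoU of 1/2, while class 1 has a pixel accuracy of 1 and an IoU of 2/3. mIoU is the stricter of the two metrics because it also penalizes false positives.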

Full Text