Abstract

Current convolutional neural network (CNN)-based methods for remote sensing image segmentation require large numbers of densely annotated training images and generalize poorly to unseen object categories. In this letter, we propose a novel few-shot learning-based method for the semantic segmentation of remote sensing images, which can label unseen object categories with only a few annotated samples. More specifically, our model first uses a deep CNN to extract high-level semantic features. A prototype representation of each class is then generated by applying masked average pooling to the feature embeddings of the support images under their ground-truth masks. Finally, our model labels the query images by matching the feature embedding of each pixel to its nearest prototype in the embedding space. The model is optimized with a nonparametric metric learning-based loss function that maximizes intra-class similarity to the learned prototypes while minimizing inter-class similarity. Experiments on the International Society for Photogrammetry and Remote Sensing (ISPRS) 2-D semantic labeling dataset demonstrate satisfactory in-domain and cross-domain transfer abilities of our model.
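The two core operations of the method, masked average pooling to build class prototypes and nearest-prototype matching over query pixels, can be sketched as follows. This is an illustrative reconstruction based only on the abstract, not the authors' code; the tensor shapes, the function names, and the use of cosine similarity as the matching metric are all assumptions.

```python
# Minimal sketch of prototype generation and nearest-prototype labeling.
# Shapes, names, and the cosine-similarity metric are assumptions, not the
# paper's published implementation.
import torch
import torch.nn.functional as F

def masked_average_pooling(features: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Average a support feature map over the pixels of one class.

    features: (C, H, W) feature embedding of a support image.
    mask:     (H, W) binary ground-truth mask for the class.
    Returns a (C,) prototype vector.
    """
    mask = mask.float()
    # Sum features over masked pixels, then normalize by the mask area.
    return (features * mask.unsqueeze(0)).sum(dim=(1, 2)) / mask.sum().clamp(min=1.0)

def match_to_prototypes(query_features: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    """Assign each query pixel to its nearest prototype by cosine similarity.

    query_features: (C, H, W) embedding of a query image.
    prototypes:     (K, C) one prototype per class.
    Returns an (H, W) map of predicted class indices.
    """
    C, H, W = query_features.shape
    q = F.normalize(query_features.view(C, -1), dim=0)  # (C, H*W), unit-norm per pixel
    p = F.normalize(prototypes, dim=1)                  # (K, C), unit-norm per class
    similarity = p @ q                                  # (K, H*W) cosine similarities
    return similarity.argmax(dim=0).view(H, W)
```

Under these assumptions, training would push each pixel embedding toward its own class prototype (high intra-class similarity) and away from the other prototypes (low inter-class similarity), which is what the nonparametric metric learning loss described above encourages.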
