Abstract

Semantic segmentation in aerial imagery remains an important yet challenging task due to the complex characteristics of remote-sensing data. The critical issues are: 1) extreme foreground–background imbalance; 2) large intra-class variance; and 3) arbitrarily oriented, dense, and small objects. These challenges make it difficult to effectively model the global interdependencies among semantically heterogeneous regions. Moreover, general semantic segmentation methods suffer from feature ambiguity caused by the joint feature learning paradigm, which degrades fine detail. In this article, we propose an improved semantic segmentation framework that tackles these problems via graph reasoning (GR) and disentangled learning. On the one hand, a simple yet effective GR unit is introduced to implement coordinate–interaction space mapping and perform relation reasoning over the graph. It can be deployed on the feature pyramid network (FPN) to exploit cross-stage multi-scale information. On the other hand, we propose a so-called disentangled learning paradigm to explicitly model the foreground and boundary objects, instantiated as foreground prior estimation (FPE) and boundary alignment (BA). The indicative information in the intermediate features is thereby effectively emphasized, enhancing the discriminative ability of the network. Extensive experiments on the iSAID, ISPRS Vaihingen, and general-purpose Cityscapes datasets demonstrate the effectiveness and efficiency of the proposed framework compared with other state-of-the-art semantic segmentation methods.
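To make the coordinate–interaction space mapping concrete, the following is a minimal PyTorch sketch of a generic graph-reasoning unit in the spirit described above: pixel features are projected onto a small set of graph nodes (the interaction space), relations are reasoned over that graph, and the result is projected back to coordinate space and fused residually. All module names, channel sizes, and the exact reasoning operator are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class GraphReasoningUnit(nn.Module):
    """Illustrative GR block (assumed design, not the authors' exact module):
    coordinate space -> interaction space -> graph reasoning -> coordinate space."""

    def __init__(self, in_channels, node_channels=64, num_nodes=32):
        super().__init__()
        self.theta = nn.Conv2d(in_channels, node_channels, kernel_size=1)   # reduce pixel features
        self.phi = nn.Conv2d(in_channels, num_nodes, kernel_size=1)         # soft assignment to nodes
        self.node_conv = nn.Conv1d(num_nodes, num_nodes, kernel_size=1)     # relation reasoning across nodes
        self.state_conv = nn.Conv1d(node_channels, node_channels, kernel_size=1)  # node state update
        self.extend = nn.Conv2d(node_channels, in_channels, kernel_size=1)  # back to input channels

    def forward(self, x):
        b, _, h, w = x.shape
        feats = self.theta(x).flatten(2)                   # (b, node_channels, h*w)
        assign = self.phi(x).flatten(2)                    # (b, num_nodes, h*w)
        # Coordinate space -> interaction space: aggregate pixels into graph nodes.
        nodes = torch.bmm(assign, feats.transpose(1, 2))   # (b, num_nodes, node_channels)
        # Relation reasoning over the fully connected node graph.
        nodes = nodes + self.node_conv(nodes)
        nodes = self.state_conv(nodes.transpose(1, 2)).transpose(1, 2)
        # Interaction space -> coordinate space: redistribute node states to pixels.
        out = torch.bmm(assign.transpose(1, 2), nodes)     # (b, h*w, node_channels)
        out = out.transpose(1, 2).reshape(b, -1, h, w)
        return x + self.extend(out)                        # residual fusion with the input feature
```

Under this sketch, one such unit could be attached to each FPN stage so that cross-stage multi-scale features are refined by graph reasoning before segmentation heads consume them; how the stages are actually combined in the proposed framework is detailed in the full text.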
