Context–content collaborative network for building extraction from high-resolution imagery

Maoguo Gong, Tongfei Liu, Mingyang Zhang, Qingfu Zhang, Di Lu, Hanhong Zheng, Fenlong Jiang

doi:10.1016/j.knosys.2023.110283

Abstract

Different application fields place different requirements on the precision and completeness of building extraction, and results that are too low in either can limit its application and adoption. Obtaining a good trade-off between precision and completeness remains a challenging issue. To address it, this paper proposes a context–content collaborative network (C3Net) with an encoder–decoder structure. It consists of a context–content aware module (C2AM) and an edge residual refinement module (ER2M). In the C2AM, a context-aware block and a content-aware block complement each other and capture the localization information of buildings and long-range dependencies between the locations of each building, respectively. Thanks to the capability of the conventional filter, the ER2M refines the decoder output features by deploying a residual atrous spatial pyramid pooling with feature edges at the scale of the original image. To explicitly guide the function of the ER2M, we introduce a separated deep supervision strategy before and after the ER2M, which can, to a certain extent, deliberately steer C3Net towards either the precision or the completeness of building extraction and improve the overall detection performance. Extensive experiments on three open and challenging datasets, with comparisons against several classical and state-of-the-art methods, demonstrate that the proposed C3Net not only achieves competitive performance but also strikes a better trade-off between precision and completeness of building extraction. The source code is released at https://github.com/TongfeiLiu/C3Net-for-building-extraction.
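To make the precision/completeness trade-off concrete, the following is a minimal sketch of how the two quantities can be computed pixel-wise for binary building masks. It assumes that completeness is equivalent to recall, which is a common convention in building extraction but is not stated in the abstract. The function name `precision_completeness` and the toy masks are hypothetical and are not taken from the C3Net repository.

```python
def precision_completeness(pred, truth):
    """Pixel-wise precision, completeness (assumed = recall) and F1.

    pred, truth: equal-shaped 2-D lists of 0/1, where 1 marks a building pixel.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # building pixel correctly detected
            elif p and not t:
                fp += 1          # background labelled as building
            elif t and not p:
                fn += 1          # building pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    completeness = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + completeness
    f1 = 2 * precision * completeness / denom if denom else 0.0
    return precision, completeness, f1


# Toy ground truth: one 2x2 building in the top-left corner.
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0]]

# A conservative prediction: no false alarms, but half the building is missed.
conservative = [[1, 1, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 0, 0]]

# A liberal prediction: the whole building is found, with two false alarms.
liberal = [[1, 1, 1, 0],
           [1, 1, 1, 0],
           [0, 0, 0, 0]]

for name, pred in [("conservative", conservative), ("liberal", liberal)]:
    p, c, f1 = precision_completeness(pred, truth)
    print(f"{name}: precision={p:.3f} completeness={c:.3f} f1={f1:.3f}")
```

The conservative mask scores high precision and low completeness, while the liberal mask does the opposite; this is the kind of tension that the abstract says the separated deep supervision strategy is meant to steer.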

Full Text