AFGL-Net: Attentive Fusion of Global and Local Deep Features for Building Façades Parsing

Dong Chen, Fan Hu, Liqiang Zhang, Jiju Peethambaran, Guiqiu Xiang, Jing Li

doi:10.3390/rs13245039

Open Access

Abstract

In this paper, we propose a deep learning framework, namely AFGL-Net, to achieve building façade parsing, i.e., obtaining the semantics of small components of building façades, such as windows and doors. To this end, we present an autoencoder that embeds position and direction encodings to encode local features. The autoencoder enhances local feature aggregation and augments the representation of skeleton features of windows and doors. We also integrate the Transformer into AFGL-Net to infer the geometric shapes and structural arrangements of façade components and to capture global contextual features. These global features can help recognize inapparent windows/doors from façade points corrupted with noise, outliers, occlusions, and irregularities. Finally, an attention-based feature fusion mechanism is employed to obtain more informative features by simultaneously considering local geometric details and global contexts. The proposed AFGL-Net is comprehensively evaluated on the Dublin and RueMonge2014 benchmarks, achieving 67.02% and 59.80% mIoU, respectively. We also demonstrate the superiority of the proposed AFGL-Net through comparisons with state-of-the-art methods and various ablation studies.
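To make the idea of attention-based fusion of local and global features more concrete, the following is a minimal NumPy sketch. It is not the AFGL-Net fusion module, whose details are not given in the abstract. It assumes per-point local and global feature matrices of the same shape and two hypothetical scoring vectors, `w_local` and `w_global`, that would be learned in a real network; the function name `attentive_fusion` is likewise an illustration only.

```python
import numpy as np

def attentive_fusion(local_feats, global_feats, w_local, w_global):
    """Fuse two per-point feature sets with softmax attention weights.

    local_feats, global_feats: arrays of shape (N, C).
    w_local, w_global: hypothetical scoring vectors of shape (C,).
    Returns the fused features (N, C) and the attention weights (N, 2).
    """
    # One scalar score per point and per feature source.
    scores = np.stack(
        [local_feats @ w_local, global_feats @ w_global], axis=1
    ).astype(float)
    # Numerically stable softmax over the two sources.
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    # Per-point convex combination of local and global features.
    fused = weights[:, :1] * local_feats + weights[:, 1:] * global_feats
    return fused, weights
```

In this sketch each point receives its own pair of weights, so a point with weak local evidence (for example, an occluded window) can lean more heavily on the global context.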

Highlights

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
AFGL-Net Deep Neural Network: We present AFGL-Net, a deep neural network for building façade parsing.
$K$ is the number of classes in the dataset, and $p_{ii}$ denotes the number of ground-truth points with class label $i$ that are correctly predicted as class $i$. The parameters $p_{ij}$ and $p_{ji}$ denote the numbers of points that have class label $i$ or $j$, respectively, but are incorrectly predicted as class $j$ or $i$.
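The quantities above are the entries of a confusion matrix, and they are the ingredients of the mIoU scores reported in the abstract. The sketch below assumes the commonly used definition, in which the IoU of class i is p_ii divided by the sum of p_ii, the points of class i predicted as other classes, and the points of other classes predicted as i, and mIoU is the mean over classes. The exact formula is not reproduced on this page, so treat this as an illustration; the helper names and the choice to skip classes that never occur are assumptions.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Entry [i, j] counts points with ground-truth label i predicted as j."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    flat = y_true * num_classes + y_pred
    counts = np.bincount(flat, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou(conf):
    """Mean intersection-over-union from a K x K confusion matrix."""
    tp = np.diag(conf).astype(float)   # p_ii
    fn = conf.sum(axis=1) - tp         # sum over j != i of p_ij
    fp = conf.sum(axis=0) - tp         # sum over j != i of p_ji
    denom = tp + fn + fp
    # Assumption: classes absent from both labels and predictions are skipped.
    valid = denom > 0
    return float((tp[valid] / denom[valid]).mean())

# Tiny example with K = 3 classes and five points.
conf = confusion_matrix([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], num_classes=3)
print(mean_iou(conf))
```

In the example, the per-class IoU values are 1/2, 2/3, and 1, so the mean is 13/18.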

Summary

Introduction

Buildings are among the most fundamental and important elements of cities. 3D models of buildings have been widely used in many modern-day applications, such as indoor and/or outdoor navigation [1,2] and building energy modeling [3]. Recently, deep learning has shown strong results for point cloud semantic segmentation. Because the point cloud is scattered, unordered, and unorganized, it ...

Methods

Results

Conclusion