Abstract

Accurate positioning of curtain wall frames is crucial for the automated installation of curtain wall modules. However, current robot-based installation methods rely heavily on visual guidance from human operators, resulting in high costs and limited construction efficiency. Advances in deep learning have made image segmentation a promising solution for the visual positioning of curtain wall frames. This paper proposes a context collaboration pyramid network that automatically segments curtain wall frames by combining context interaction with a channel-guided pyramid structure. The model adopts an encoder-decoder architecture with a feature interaction block strategically inserted between the encoder and decoder. Specifically, the encoder uses a pyramid pooling Transformer backbone to extract multi-level features from the original RGB images, and the decoder employs a channel-guided pyramid convolution module to integrate multi-scale features and produce finer predictions. In addition, a context interaction fusion module is carefully designed between features of adjacent levels to enhance collaboration across the architecture. A benchmark dataset for the curtain wall frame segmentation task, consisting of 1547 images, was also established; it covers challenging scenarios including strong light, low contrast, and cluttered backgrounds. Evaluated on this dataset, the method achieves an accuracy of 97.30% and an F1-score of 88.95%, outperforming other segmentation networks. Overall, the proposed method extracts target information accurately and efficiently and provides the critical visual guidance a robot needs, thereby advancing the automation of curtain wall module installation.
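The paper itself carries the architectural details; as a rough illustration of the pipeline the abstract describes, the following is a minimal PyTorch sketch of an encoder-decoder segmenter with cross-level context fusion and a channel-guided pyramid decoder block. All class names (FrameSegNet, ContextInteractionFusion, ChannelGuidedPyramidConv) and the plain strided-convolution encoder, which merely stands in for the pyramid pooling Transformer backbone, are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextInteractionFusion(nn.Module):
    """Hypothetical fusion of adjacent-level features: the deeper (coarser)
    feature is upsampled and used to gate the shallower one, then both are
    merged by a 3x3 convolution."""
    def __init__(self, high_ch, low_ch, out_ch):
        super().__init__()
        self.reduce = nn.Conv2d(high_ch, low_ch, kernel_size=1)
        self.merge = nn.Sequential(
            nn.Conv2d(low_ch * 2, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, low, high):
        high = self.reduce(high)
        high = F.interpolate(high, size=low.shape[2:], mode="bilinear",
                             align_corners=False)
        gated = low * torch.sigmoid(high)            # cross-level interaction
        return self.merge(torch.cat([gated, high], dim=1))

class ChannelGuidedPyramidConv(nn.Module):
    """Hypothetical decoder block: parallel dilated convolutions capture
    multi-scale context; a squeeze-excitation-style channel attention
    weights them before the final 1x1 projection."""
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=d, dilation=d)
            for d in dilations
        )
        mid = out_ch * len(dilations)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(mid, mid, 1),
            nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(mid, out_ch, 1)

    def forward(self, x):
        feats = torch.cat([b(x) for b in self.branches], dim=1)
        return self.proj(feats * self.attn(feats))   # channel-guided weighting

class FrameSegNet(nn.Module):
    """Toy encoder-decoder; a strided-conv encoder stands in for the
    pyramid pooling Transformer backbone used in the paper."""
    def __init__(self, channels=(32, 64, 128, 256), num_classes=1):
        super().__init__()
        self.stages = nn.ModuleList()
        in_ch = 3
        for ch in channels:
            self.stages.append(nn.Sequential(
                nn.Conv2d(in_ch, ch, 3, stride=2, padding=1),
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True)))
            in_ch = ch
        self.fusions = nn.ModuleList(
            ContextInteractionFusion(channels[i + 1], channels[i], channels[i])
            for i in range(len(channels) - 1)
        )
        self.decoder = ChannelGuidedPyramidConv(channels[0], 32)
        self.head = nn.Conv2d(32, num_classes, 1)

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        # Fuse top-down: each level interacts with the one above it.
        fused = feats[-1]
        for i in range(len(self.fusions) - 1, -1, -1):
            fused = self.fusions[i](feats[i], fused)
        out = self.head(self.decoder(fused))
        return F.interpolate(out, scale_factor=2, mode="bilinear",
                             align_corners=False)

if __name__ == "__main__":
    logits = FrameSegNet()(torch.randn(1, 3, 256, 256))
    print(logits.shape)  # torch.Size([1, 1, 256, 256])
```

The top-down fusion loop mirrors the abstract's description of interaction between adjacent feature levels; in the actual model, the published backbone and module designs would replace these placeholders.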
