Abstract

Accurate extraction of ground objects from high spatial resolution imagery can effectively support the planning and deployment of production work in mining areas. It is therefore essential to extract ground objects in mining areas quickly and accurately. However, the task is difficult because of the variety of complex scenes and the high intra-class variation, low inter-class variation, similar appearance, and large size differences among ground objects in mining areas. Traditional extraction methods based only on RGB images can no longer meet the requirement for high precision. To address these problems, we propose an attention-based multi-level feature fusion network (AMFNet), which extracts ground objects more accurately from unmanned aerial vehicle (UAV) RGB images and a digital surface model (DSM). The DSM of the mining area provides the height of the ground objects, which serves as a key supplementary feature when objects cannot be distinguished by appearance alone in the RGB images. AMFNet is a dual-branch encoder–decoder network that takes the RGB images and the DSM as separate branch inputs. Specifically, the channel attention module, MF, and atrous convolution are introduced into the encoder network to make better use of the dual-branch features. The decoder network upsamples the deep features and uses skip connections to fuse them with the multi-level shallow features produced by the encoder. Finally, we conduct multiple comparison and ablation experiments on the test dataset, which demonstrate the performance advantages of AMFNet from different perspectives.
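To make the dual-branch fusion idea concrete, the following is a minimal, illustrative PyTorch-style sketch, not the authors' released AMFNet code. It assumes a squeeze-and-excitation style channel attention design and a single fusion point per encoder level (the names ChannelAttention and DualBranchFusionBlock are hypothetical); the abstract does not specify the exact structure of the channel attention or MF modules.

    # Illustrative sketch (assumed design, not the paper's implementation):
    # fuse RGB and DSM features with channel attention and atrous convolution.
    import torch
    import torch.nn as nn


    class ChannelAttention(nn.Module):
        """Squeeze-and-excitation style channel attention (assumed design)."""
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),                       # global average pooling
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),
            )

        def forward(self, x):
            return x * self.fc(x)                              # reweight channels


    class DualBranchFusionBlock(nn.Module):
        """Fuse RGB-branch and DSM-branch features at one encoder level."""
        def __init__(self, rgb_ch, dsm_ch, out_ch, dilation=2):
            super().__init__()
            # atrous (dilated) convolution enlarges the receptive field
            # without reducing spatial resolution
            self.fuse = nn.Sequential(
                nn.Conv2d(rgb_ch + dsm_ch, out_ch, 3,
                          padding=dilation, dilation=dilation),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            self.attn = ChannelAttention(out_ch)

        def forward(self, rgb_feat, dsm_feat):
            fused = self.fuse(torch.cat([rgb_feat, dsm_feat], dim=1))
            return self.attn(fused)                            # attention-weighted fused features


    if __name__ == "__main__":
        # toy feature maps: 64 RGB channels + 32 DSM channels at 128x128
        rgb = torch.randn(1, 64, 128, 128)
        dsm = torch.randn(1, 32, 128, 128)
        block = DualBranchFusionBlock(rgb_ch=64, dsm_ch=32, out_ch=64)
        print(block(rgb, dsm).shape)                           # torch.Size([1, 64, 128, 128])

In a full encoder–decoder, the attention-weighted fused features at each level would be kept as the multi-level shallow features that the decoder later merges with upsampled deep features through skip connections, as described in the abstract.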
