Abstract

Hyperspectral remote sensing imagery supports many aspects of daily life through applications such as compiling urban building statistics and estimating green vegetation. Ensuring the accuracy of automatic thematic information extraction with limited samples remains a challenge. In this manuscript, a lightweight semantic segmentation model based on the "encoder-decoder" structure is proposed for extracting buildings from hyperspectral remote sensing images. The encoder combines the lightweight MobileNet with multiscale feature fusion and group dilated convolution to model both shallow and deep spatial and spectral features, while an efficient combined standardized attention mechanism selects the most valuable bands and local information. Extensive experiments reveal that our method achieves higher accuracy than state-of-the-art lightweight models in building extraction tasks. We also demonstrate the superiority of our method when training samples are insufficient. When only 50% of the samples of the initial training set were used, the mean intersection over union (mIOU) reached 91.90%, 4.5% higher than that of the next best method. For training sets composed of only 16 and 8 images, the mIOU values were 89.42% and 77.11%, respectively, 13.6 and 18 percentage points higher than those of the next best method. The visualized results show that the proposed method clearly outperformed the compared methods. The model proposed in this paper is therefore suitable for accurately extracting buildings from hyperspectral images when training samples are limited.
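To make the encoder design described above concrete, the following is a minimal PyTorch sketch of that idea: a MobileNetV2 backbone, a group dilated convolution block for multiscale context, and a simple channel (band) attention gate. The module names, dilation rates, group count, and band count are illustrative assumptions, not the authors' exact implementation.

```python
# Hedged sketch of a lightweight hyperspectral encoder:
# MobileNetV2 features -> group dilated convolutions -> band attention.
import torch
import torch.nn as nn
import torchvision


class GroupDilatedBlock(nn.Module):
    """Parallel grouped 3x3 convolutions with different dilation rates,
    concatenated and fused back to the input channel count."""

    def __init__(self, channels: int, dilations=(1, 2, 4), groups: int = 4):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d,
                          groups=groups, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))


class BandAttention(nn.Module):
    """Squeeze-and-excitation style gate that re-weights spectral/feature
    channels by their global importance (a stand-in for the paper's
    combined standardized attention)."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)


class LightweightEncoder(nn.Module):
    """MobileNetV2 backbone adapted to hyperspectral input, followed by
    multiscale dilated context and band attention."""

    def __init__(self, in_bands: int = 32):
        super().__init__()
        backbone = torchvision.models.mobilenet_v2(weights=None).features
        # Replace the first convolution to accept the hyperspectral band count.
        backbone[0][0] = nn.Conv2d(in_bands, 32, 3, stride=2, padding=1, bias=False)
        self.backbone = backbone
        self.context = GroupDilatedBlock(1280)   # 1280 = MobileNetV2 output channels
        self.attention = BandAttention(1280)

    def forward(self, x):
        return self.attention(self.context(self.backbone(x)))


if __name__ == "__main__":
    encoder = LightweightEncoder(in_bands=32)
    features = encoder(torch.randn(1, 32, 128, 128))
    print(features.shape)  # deep feature map at 1/32 resolution, e.g. (1, 1280, 4, 4)
```

A decoder (not sketched here) would upsample and fuse these deep features with shallower backbone features to produce the building segmentation map.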
