Highlights
A paddy field segmentation network model, SA-DeepLabv3+, is proposed for hilly areas in southern China.
SA-DeepLabv3+ uses an attention mechanism and an adaptive spatial feature fusion algorithm.
The deep model was trained end-to-end using a few annotated image datasets.
SA-DeepLabv3+ achieves good segmentation results for the four common types of paddy fields.

Abstract. The hilly areas of southern China have complex feature distributions, and the boundaries of cultivated land are irregular. This environment causes image segmentation problems, such as low accuracy in interpreting arable land from low-altitude UAV remote sensing images. This paper proposes SA-DeepLabv3+, a paddy field segmentation model that combines an attention mechanism and the adaptive spatial feature fusion (ASFF) algorithm with the DeepLabv3+ semantic segmentation model. The sampling areas are Yangwan Village, Duchang County, Jiangxi Province, China; Keli Village, Xinjian County, Nanchang City; and Chengxin Farm, Nanchang City. A high-resolution image dataset was built by capturing low-altitude remote sensing images of paddy fields with DJI UAVs, and the dataset images are categorised into four types of paddy fields. Based on an analysis of the image features of the dataset, an scSE attention mechanism module is incorporated into the backbone network of the DeepLabv3+ encoder to remap the feature maps and enhance the paddy field feature representation. The ASFF algorithm is applied in the decoder, where it obtains adaptive weight coefficients to mitigate the inadequate fusion of multi-scale features during upsampling. By comparing the segmentation results of models with different ASPP dilation rates, the optimal set of dilation rates is determined to improve the model's ability to extract fine features of paddy fields. Finally, the performance of the model for paddy field segmentation is verified through comparison experiments with different models.
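The adaptive weighting idea behind ASFF can be illustrated with a minimal sketch. It assumes the commonly described form of ASFF, in which per-pixel weight logits for each scale are normalised with a softmax so that the weights sum to one at every location, and the fused feature is the weighted sum across scales. The function names `softmax` and `asff_fuse` are hypothetical, the logits are passed in directly rather than learned by convolution layers, and the feature maps are assumed to have already been resized to a common resolution. How the weights are wired into the SA-DeepLabv3+ decoder is not stated in the abstract, so this is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of numbers."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def asff_fuse(features, weight_logits):
    """Fuse same-sized single-channel feature maps with per-pixel adaptive weights.

    features      : list of S maps, each an H x W nested list of floats
                    (assumed already resized to one resolution)
    weight_logits : list of S maps of the same shape; in a real network these
                    would be produced by learned layers (assumption: given here)
    """
    n_scales = len(features)
    height = len(features[0])
    width = len(features[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Weights across scales sum to 1 at every spatial location.
            weights = softmax([weight_logits[s][i][j] for s in range(n_scales)])
            fused[i][j] = sum(
                weights[s] * features[s][i][j] for s in range(n_scales)
            )
    return fused
```

With equal logits at a pixel the fusion reduces to a plain average of the scales, and a much larger logit for one scale makes the fused value follow that scale almost exclusively, which is the sense in which the weighting is adaptive.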
The data show that the mean values of PA, MIoU, Recall, and F1 score of the SA-DeepLabv3+ paddy field segmentation model are 0.955, 0.875, 0.865, and 0.908, respectively, improvements of 0.019, 0.032, 0.008, and 0.005. The model also achieves higher segmentation performance than the typical segmentation models UNet, SegNet, and PSPNet. The results show that the SA-DeepLabv3+ paddy field segmentation model has high accuracy and robustness. It provides an important basis for acquiring high-precision paddy field boundary positioning information and constructing high-precision maps of several paddy fields in larger areas, and it plays a positive role in promoting efficient and accurate information management of paddy fields.
Keywords: Attention mechanism, ASFF, DeepLabv3+ network, Paddy field.
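For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute pixel accuracy (PA), mean intersection over union (MIoU), recall, and F1 score from a confusion matrix. It assumes the standard definitions with macro-averaging over classes; the abstract does not give the exact formulas or averaging scheme used, and the function names `confusion_matrix` and `segmentation_metrics` are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return PA, MIoU, Recall, Precision, and F1 (macro-averaged, an assumption)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    ious, recalls, precisions = [], [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true class k, predicted otherwise
        fp = sum(cm[r][k] for r in range(n)) - tp   # predicted k, true class differs
        ious.append(tp / (tp + fp + fn) if tp + fp + fn else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
    recall = sum(recalls) / n
    precision = sum(precisions) / n
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "PA": correct / total,
        "MIoU": sum(ious) / n,
        "Recall": recall,
        "Precision": precision,
        "F1": f1,
    }
```

In use, the per-pixel labels of a ground-truth mask and a predicted mask would be flattened into two equal-length lists of class indices and passed to `confusion_matrix`, and the resulting matrix to `segmentation_metrics`.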