Abstract

Clouds lead to missing or distorted land-surface information in affected areas of optical remote sensing images. Cloud masking, which labels cloud-contaminated pixels, forms the basis for subsequent image utilization, such as excluding distorted pixels or filling in missing areas. However, due to the diverse spectral, textural, and shape characteristics of different clouds and their complicated combinations with the underlying land surfaces, cloud masking remains a challenge in remote sensing image processing. In recent years, the Mask region-based convolutional neural network (Mask R-CNN) method, which performs instance segmentation against a complex background and generates a pixelwise mask for each object of interest, has been widely used in object segmentation tasks. When Mask R-CNN is applied to cloud masking, the resulting masks exhibit certain problems, such as failure to extract uncommon clouds and inaccurate mask boundaries for large clouds. To address these problems, we introduce two strategies, group training and boundary optimization, to improve Mask R-CNN. For group training, the samples are divided into several groups. The samples in the first group are used for initial training, and the samples in the next group are used for evaluation; only samples with missed or falsely detected clouds are used to tune the classifier. These steps are repeated until all groups have been used or the detection precision becomes stable. For boundary optimization, a block-by-block masking strategy is adopted so that clouds of diverse sizes are masked with similar accuracy. Finally, two open data sets and one data set that we labelled ourselves are selected to test the proposed method. The results demonstrate that our method can produce cloud masks for different cloud types and diverse underlying land surfaces with high accuracy, thereby providing an effective alternative for cloud masking. Compared with the original Mask R-CNN, our method improves the average recall, average precision, and intersection over union by 5.88%, 2.4%, and 0.071 at the pixel level, respectively, demonstrating the effectiveness of our improvements.
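The group-training procedure described above is essentially an iterative hard-sample selection loop. The following is a minimal sketch of that loop, assuming hypothetical train_on and evaluate callables standing in for the authors' Mask R-CNN pipeline, which the abstract does not specify:

def group_training(groups, train_on, evaluate, precision_tol=0.005):
    """Sketch of the group-training loop (hypothetical interfaces).

    groups:   list of sample groups, each a list of labelled images
    train_on: callable that fine-tunes the detector on a list of samples
    evaluate: callable returning (precision, missed, false_alarm) per sample
    """
    train_on(groups[0])  # initial training on the first group
    prev_precision = None

    for group in groups[1:]:
        # Evaluate the current model on the next group.
        results = [evaluate(sample) for sample in group]

        # Keep only samples with missed or falsely detected clouds.
        hard = [s for s, (p, missed, false_alarm) in zip(group, results)
                if missed or false_alarm]
        if hard:
            train_on(hard)  # tune the classifier on hard samples only

        # Stop once the detection precision becomes stable.
        precision = sum(p for p, _, _ in results) / len(results)
        if prev_precision is not None and abs(precision - prev_precision) < precision_tol:
            break
        prev_precision = precision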
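The boundary-optimization strategy can be read as tiled inference: every cloud, however large, is segmented at the same effective scale. Below is a minimal sketch under that reading; the block size, the border handling, and the predict_block callable are assumptions, as the abstract gives no implementation details:

import numpy as np

def blockwise_cloud_mask(image, predict_block, block=512):
    """Assemble a full-image cloud mask block by block.

    predict_block: callable returning a binary mask with the same
    height/width as the input tile (hypothetical interface).
    """
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    for y in range(0, h, block):
        for x in range(0, w, block):
            # Edge tiles may be smaller than `block`; slicing handles this.
            tile = image[y:y + block, x:x + block]
            mask[y:y + block, x:x + block] = predict_block(tile)
    return mask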
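For reference, the pixel-level recall, precision, and intersection over union quoted in the comparison follow the standard confusion-matrix definitions; the snippet below reflects those definitions rather than code from the paper, and assumes both masks contain at least one cloud pixel:

import numpy as np

def pixel_metrics(pred, truth):
    """pred, truth: binary arrays where 1 marks cloud pixels."""
    tp = np.count_nonzero((pred == 1) & (truth == 1))  # true positives
    fp = np.count_nonzero((pred == 1) & (truth == 0))  # false positives
    fn = np.count_nonzero((pred == 0) & (truth == 1))  # false negatives
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    iou = tp / (tp + fp + fn)
    return recall, precision, iou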
