Advancements in UAV image semantic segmentation: A comprehensive literature review

Shouket A. Ahmed, Poh Soon Joseph Ng, Taha A. Taha, Abadal-Salam T. Hussain, Raed Abdulkareem Hasan, Hazry Desa, Haider Kh. Easa, Sinan Q. Salih, Omer K. Ahmed

doi:10.31893/multirev.2024118

Abstract

Unmanned Aerial Vehicles (UAVs) have transformed data acquisition across many domains and offer considerable potential for image processing and semantic segmentation. This literature review surveys advancements, techniques, challenges, and datasets pertaining to UAV image semantic segmentation. It begins by introducing the fundamental concepts of UAVs and their role in capturing high-resolution imagery for diverse applications. It emphasizes the integration of deep learning algorithms with UAVs, which opens new possibilities in autonomous flight, security, and environmental monitoring. Turning to the core principles of semantic segmentation, the review explains the central task of classifying every pixel in an image. Convolutional Neural Networks (CNNs) are presented as the cornerstone technology, and their evolution is traced from traditional CNNs to the highly adaptable Fully Convolutional Networks (FCNs). A substantial portion of the review is dedicated to FCNs, underscoring their ability to process images of varying dimensions while maintaining spatial coherence in the output, and articulating their role in semantic segmentation, which encompasses both classification and localization. The review then surveys state-of-the-art models, including SegNet, PSPNet, DeepLabNet, EfficientNet, DenseNet-C, and LinkNet, each of which contributes distinct strengths and applications to semantic segmentation. The latter parts of the review focus on the versatile U-Net architecture: its fundamental structure is described, followed by an examination of its many adaptations, namely 3D-U-Net, ResU-Net, U-Net++, Adversarial U-Net, Cascaded U-Net, and Improved U-Net 3+.
These modifications address intrinsic challenges such as limited receptive fields and class imbalance, bringing U-Net to the forefront of image segmentation techniques. The review then turns to the application of U-Net in UAV image segmentation, illustrating its efficacy in tasks such as land cover and crop classification. Nevertheless, persistent challenges, such as the scarcity of annotated datasets and the need for models to generalize across varied environmental conditions, remain key areas of concern. The review concludes by underlining the significance of large, authentic datasets and data augmentation techniques, and briefly surveys publicly available UAV image datasets that can be used for training and evaluating models. Overall, it captures recent developments in UAV image processing and semantic segmentation and identifies avenues for future research in this rapidly growing field.

Full Text