Navigation mark detection based on deep learning models from UAV images

Abstract

A prosperous waterway economy requires rigorous safety measures. Unmanned aerial vehicles (UAVs) provide massive volumes of imagery of inland waterways, within which navigation mark detection plays a critical role in ensuring waterway safety. This paper proposes a deep learning-based method for detecting navigation marks in UAV images. Firstly, a dataset of inland waterway navigation marks is constructed from UAV aerial images, covering data collection, image enhancement, sample creation, and sample annotation. Secondly, a deep learning network model is developed that uses ResNet-50 as the backbone, incorporates Coordinate Attention and Large-Scale Selective Kernel Attention mechanisms, integrates a Feature Pyramid Network (FPN) for feature enhancement, and uses Distance Intersection over Union (DIoU) as the loss function. Thirdly, the model is trained and evaluated on the constructed dataset, followed by precision assessment and post-processing. This paper explores a deep learning network model for small object detection in UAV images and establishes a comprehensive workflow for detecting inland waterway navigation marks, thereby providing technical support for waterway safety.
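The DIoU loss named in the abstract penalizes the distance between box centers in addition to their overlap. A minimal sketch of the per-box computation (plain Python, boxes as (x1, y1, x2, y2) corners; not the paper's implementation):

```python
def diou_loss(box_a, box_b):
    """Distance-IoU loss for two axis-aligned boxes in (x1, y1, x2, y2) form.

    DIoU = 1 - IoU + d^2 / c^2, where d is the distance between the two
    box centers and c is the diagonal of the smallest enclosing box.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection and union areas for the IoU term.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared distance between box centers.
    d2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Squared diagonal of the smallest box enclosing both inputs.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2

    return 1.0 - iou + d2 / c2
```

Unlike a plain IoU loss, the distance term still produces a useful gradient when the boxes do not overlap at all, which is one reason DIoU is popular for small-object regression.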

Similar Papers
  • Research Article
  • Cited by: 64
  • 10.1016/j.neucom.2020.08.074
An empirical study of multi-scale object detection in high resolution UAV images
  • Sep 28, 2020
  • Neurocomputing
  • Haijun Zhang + 5 more

  • Conference Article
  • Cited by: 4
  • 10.1109/indin41052.2019.8972320
Learning-based Object Detection in High Resolution UAV Images: An Empirical Study
  • Jul 1, 2019
  • Haijun Zhang + 4 more

Deep learning-based methods are continuously boosting the performance of object detection in natural images. In contrast, detecting objects in unmanned aerial vehicle (UAV) images remains a difficult task in computer vision, owing to the challenge of training a well-performing detection model on UAV images, which usually contain instances with varied orientations, scales, and contours. Furthermore, only a few researchers have focused on this field, probably because of the difficulties of UAV data acquisition and labelling. Inspired by this, we collected a large-scale dataset of multi-scale, high-resolution UAV images, named MOHR, which contains 10,631 images captured by a UAV fitted with three kinds of cameras. Since these images were captured in a suburban environment, we manually annotated five classes of objects: car, truck, building, collapse, and flood damage. An empirical study was then conducted with six advanced object detection methods, all based on deep learning. The results indicate the great potential of the evaluated detection models, but also reveal that research on such a challenging UAV dataset with current deep learning techniques still has a long way to go.

  • Research Article
  • Cited by: 153
  • 10.1016/j.jag.2017.05.002
Comparison of UAV and WorldView-2 imagery for mapping leaf area index of mangrove forest
  • May 12, 2017
  • International Journal of Applied Earth Observation and Geoinformation
  • Jinyan Tian + 6 more

  • Research Article
  • Cited by: 54
  • 10.3390/rs9040376
Automatic UAV Image Geo-Registration by Matching UAV Images to Georeferenced Image Data
  • Apr 17, 2017
  • Remote Sensing
  • Xiangyu Zhuo + 4 more

Recent years have witnessed the fast development of UAVs (unmanned aerial vehicles). As an alternative to traditional image acquisition methods, UAVs bridge the gap between terrestrial and airborne photogrammetry and enable flexible acquisition of high resolution images. However, the georeferencing accuracy of UAVs is still limited by the low-performance on-board GNSS and INS. This paper investigates automatic geo-registration of an individual UAV image or UAV image blocks by matching the UAV image(s) with a previously taken georeferenced image, such as an individual aerial or satellite image with a height map attached or an aerial orthophoto with a DSM (digital surface model) attached. As the biggest challenge in matching UAV and aerial images lies in the large differences in scale and rotation, we propose a novel feature matching method for nadir or slightly tilted images. The method comprises a dense feature detection scheme, a one-to-many matching strategy, and a global geometric verification scheme. The proposed method is able to find thousands of valid matches in cases where SIFT and ASIFT fail. Those matches can be used to geo-register the whole UAV image block towards the reference image data. When the reference images offer high georeferencing accuracy, the UAV images can also be geolocalized in a global coordinate system. A series of experiments involving different scenarios was conducted to validate the proposed method. The results demonstrate that our approach achieves not only decimeter-level registration accuracy, but also global accuracy comparable to that of the reference images.

  • Research Article
  • Cited by: 8
  • 10.1007/s40747-023-01076-6
HRCTNet: a hybrid network with high-resolution representation for object detection in UAV image
  • May 15, 2023
  • Complex & Intelligent Systems
  • Wenjie Xing + 2 more

Object detection in unmanned aerial vehicle (UAV) images has attracted increasing attention from researchers in recent years. However, small object detection is challenging for conventional detection methods because little location and semantic information can be extracted from the feature maps of UAV images. To remedy this problem, three new feature extraction modules are proposed in this paper to refine the feature maps for small objects in UAV images: the Small-Kernel-Block (SKBlock), the Large-Kernel-Block (LKBlock), and the Conv-Trans-Block (CTBlock). Based on these three modules, a novel backbone called the High-Resolution Conv-Trans Network (HRCTNet) is proposed. Additionally, the Acon activation function is deployed in the network to reduce the possibility of dying ReLUs and remove redundant features. Given the extremely imbalanced labels in UAV image datasets, the PolyLoss loss function is adopted to train HRCTNet. To verify the effectiveness of the proposed HRCTNet, experiments were conducted on several datasets. On the VisDrone dataset, HRCTNet achieves 49.5% AP50 and 29.1% AP. On the COCO dataset, with limited FLOPs, HRCTNet achieves 37.9% AP and 24.1% APS. The experimental results demonstrate that HRCTNet outperforms existing methods for object detection in UAV images.
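Assuming the loss referred to above ("Ployloss") is the Poly-1 variant of PolyLoss, it adds a first-order term ε(1 − p_t) to the standard cross-entropy. A minimal sketch for a single sample (illustrative, not the paper's code):

```python
import math

def poly1_ce_loss(probs, target, epsilon=1.0):
    """Poly-1 loss: cross-entropy plus epsilon * (1 - p_t), where p_t is
    the predicted probability of the true class.

    `probs` is a softmax distribution over classes and `target` is the
    ground-truth class index; `epsilon` tunes the extra polynomial term.
    """
    p_t = probs[target]
    return -math.log(p_t) + epsilon * (1.0 - p_t)
```

With epsilon = 0 this reduces to ordinary cross-entropy; larger epsilon puts extra weight on poorly predicted (low p_t) samples, which is why the formulation suits imbalanced labels.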

  • Research Article
  • Cited by: 3
  • 10.1080/01431161.2024.2429781
A spatio-temporal-spectral fusion framework for downscaling Sentinel-2 images using UAV images
  • Nov 25, 2024
  • International Journal of Remote Sensing
  • Weikai Zhang + 1 more

Sentinel-2 multispectral (S2-MS) images, equipped with three red-edge (Red-E) bands, serve as an optimal data source for vegetation monitoring. However, their spatial resolution of 10–20 m greatly restricts their utility for local, precise monitoring. Widely used consumer-grade unmanned aerial vehicles (UAVs) provide much finer spatial resolution images, but typically only in the visible and near-infrared spectral bands. UAV and S2-MS images are thus strongly complementary in spatial, temporal, and spectral resolution. This paper establishes a spatio-temporal-spectral (STS) fusion framework for downscaling S2-MS images using UAV images. First, the spatio-temporal (ST) fusion method Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) is applied to the spatio-spectral (SS) fusion of UAV and S2-MS images; it is verified to perform better than existing SS fusion methods and to be robust across spatial scales. Then, CA-STARFM is generated by coupling STARFM with Consistent Adjustment of the Climatology to Actual Observations (CACAO) and used to further optimize the SS fusion results, yielding stronger performance. Moreover, the applicability of CA-STARFM to STS fusion is further verified on a UAV-like image generated by the ST fusion of UAV and S2-MS images. The results indicate that STARFM is competent for SS fusion at large spatial scales, while CA-STARFM can not only optimize the ST fusion of UAV and satellite images but is also promising for SS fusion. Therefore, the proposed fusion framework provides a potential solution for integrating the spatial, temporal, and spectral information of UAV and S2-MS images for precise monitoring.
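At its core, STARFM predicts the fine-resolution image at the target date from the fine image at a base date plus the coarse-scale temporal change. A heavily simplified, per-pixel sketch of that temporal core (the real algorithm additionally weights spectrally and spatially similar neighbouring pixels, which this sketch omits):

```python
import numpy as np

def starfm_pixelwise(fine_t0, coarse_t0, coarse_t1):
    """Per-pixel temporal core of a STARFM-style prediction: add the
    coarse-scale change between the base date (t0) and the target date
    (t1) to the fine-resolution base image.  All three inputs are
    co-registered reflectance arrays of the same shape."""
    return fine_t0 + (coarse_t1 - coarse_t0)
```

This is only the zeroth-order behaviour; the neighbourhood weighting is what lets the full algorithm avoid transferring coarse-pixel blockiness into the fine prediction.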

  • Research Article
  • Cited by: 112
  • 10.3390/app9040643
Impact of Texture Information on Crop Classification with Machine Learning and UAV Images
  • Feb 14, 2019
  • Applied Sciences
  • Geun-Ho Kwak + 1 more

Unmanned aerial vehicle (UAV) images that can provide thematic information at much higher spatial and temporal resolutions than satellite images have great potential in crop classification. Due to the ultra-high spatial resolution of UAV images, spatial contextual information such as texture is often used for crop classification. From a data availability viewpoint, it is not always possible to acquire time-series UAV images due to limited accessibility to the study area. Thus, it is necessary to improve classification performance for situations when a single or minimum number of UAV images are available for crop classification. In this study, we investigate the potential of gray-level co-occurrence matrix (GLCM)-based texture information for crop classification with time-series UAV images and machine learning classifiers including random forest and support vector machine. In particular, the impact of combining texture and spectral information on the classification performance is evaluated for cases that use only one UAV image or multi-temporal images as input. A case study of crop classification in Anbandegi of Korea was conducted for the above comparisons. The best classification accuracy was achieved when multi-temporal UAV images which can fully account for the growth cycles of crops were combined with GLCM-based texture features. However, the impact of the utilization of texture information was not significant. In contrast, when one August UAV image was used for crop classification, the utilization of texture information significantly affected the classification performance. Classification using texture features extracted from GLCM with larger kernel size significantly improved classification accuracy, an improvement of 7.72%p in overall accuracy for the support vector machine classifier, compared with classification based solely on spectral information. These results indicate the usefulness of texture information for classification of ultra-high-spatial-resolution UAV images, particularly when acquisition of time-series UAV images is difficult and only one UAV image is used for crop classification.
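GLCM texture features like those used above are computed from co-occurrence counts of gray levels at a fixed pixel offset. A small NumPy sketch for a single offset and two classic statistics (contrast and homogeneity); production code would use a dedicated library routine:

```python
import numpy as np

def glcm_features(img, levels=4, offset=(0, 1)):
    """Gray-level co-occurrence matrix for one pixel offset, plus two
    classic texture statistics derived from it.

    `img` holds integer gray levels in [0, levels); `offset` is the
    (row, col) displacement between the two pixels of each pair.
    """
    dr, dc = offset
    glcm = np.zeros((levels, levels), dtype=float)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                glcm[img[r, c], img[r2, c2]] += 1
    glcm /= glcm.sum()  # normalize counts to a joint probability table

    i, j = np.indices((levels, levels))
    contrast = float((glcm * (i - j) ** 2).sum())          # high for rough texture
    homogeneity = float((glcm / (1.0 + (i - j) ** 2)).sum())  # high for smooth texture
    return contrast, homogeneity
```

A perfectly uniform patch gives contrast 0 and homogeneity 1; alternating stripes at the chosen offset push contrast up and homogeneity down, which is exactly the separation texture-based classifiers exploit.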

  • Research Article
  • Cited by: 56
  • 10.3390/rs11101226
Rapid Mosaicking of Unmanned Aerial Vehicle (UAV) Images for Crop Growth Monitoring Using the SIFT Algorithm
  • May 23, 2019
  • Remote Sensing
  • Jianqing Zhao + 6 more

To improve the efficiency and effectiveness of mosaicking unmanned aerial vehicle (UAV) images, we propose in this paper a rapid mosaicking method based on the scale-invariant feature transform (SIFT) for mosaicking UAV images used for crop growth monitoring. The proposed method dynamically sets an appropriate contrast threshold in the difference of Gaussian (DOG) scale-space according to the contrast characteristics of UAV images used for crop growth monitoring. This adjusts and optimizes the number of matched feature point pairs in UAV images and increases mosaicking efficiency. Meanwhile, based on the relative location relationship of UAV images used for crop growth monitoring, the random sample consensus (RANSAC) algorithm is integrated to eliminate the influence of mismatched point pairs on mosaicking and to maintain mosaicking accuracy and quality. Mosaicking experiments were conducted on three types of UAV images in crop growth monitoring: visible, near-infrared, and thermal infrared. The results indicate that, compared to the standard SIFT algorithm and frequently used commercial mosaicking software, the proposed method significantly improves the applicability, efficiency, and accuracy of mosaicking UAV images in crop growth monitoring. In comparison with image mosaicking based on the standard SIFT algorithm, the time efficiency of the proposed method is higher by 30%, and its structural similarity index of mosaicking accuracy is about 0.9. Meanwhile, the approach successfully mosaics low-resolution UAV images used for crop growth monitoring and improves the applicability of the SIFT algorithm, providing a technical reference for UAV applications in crop growth and phenotypic monitoring.
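The RANSAC step described above can be illustrated with a toy consensus loop. This sketch fits only a 2-D translation between matched keypoint pairs rather than the homography a real mosaicker would estimate, but the hypothesize-and-count structure is the same:

```python
import random

def ransac_translation(matches, threshold=2.0, iterations=200, seed=0):
    """Estimate a 2-D translation between matched point pairs with RANSAC.

    `matches` is a list of ((x1, y1), (x2, y2)) pairs.  Each iteration
    samples one pair to define a candidate translation, then counts the
    pairs whose residual stays below `threshold` as inliers; the shift
    with the largest consensus set wins.
    """
    rng = random.Random(seed)
    best_shift, best_inliers = None, []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - dx) < threshold
                   and abs(m[1][1] - m[0][1] - dy) < threshold]
        if len(inliers) > len(best_inliers):
            best_shift, best_inliers = (dx, dy), inliers
    return best_shift, best_inliers
```

Mismatched SIFT pairs land far from the consensus translation, so they are simply outvoted; this is how RANSAC keeps a few bad matches from warping the mosaic.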

  • Research Article
  • Cited by: 72
  • 10.3390/rs12193140
Object Detection in UAV Images via Global Density Fused Convolutional Network
  • Sep 24, 2020
  • Remote Sensing
  • Ruiqian Zhang + 4 more

Object detection in unmanned aerial vehicle (UAV) images plays a fundamental role in a wide variety of applications. As UAVs are maneuverable, with high speed, multiple viewpoints, and varying altitudes, objects in UAV images are highly heterogeneous, varying in size and densely distributed, which makes object detection with existing algorithms very difficult. To address these issues, we propose a novel global density fused convolutional network (GDF-Net) optimized for object detection in UAV images. We test the effectiveness and robustness of the proposed GDF-Nets on the VisDrone and UAVDT datasets. The designed GDF-Net consists of a backbone network, a Global Density Model (GDM), and an object detection network. Specifically, the GDM refines density features via dilated convolutional networks, aiming to deliver larger receptive fields and to generate global density fused features. Compared with the base networks, adding the GDM improves model performance in both recall and precision. We also find that the designed GDM facilitates the detection of objects in congested scenes with high distribution density. The presented GDF-Net framework can be instantiated not only with the base networks selected in this study but also with other popular object detection models.
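The claim that dilated convolutions deliver larger receptive fields can be made concrete: with stride 1, each layer adds (k − 1)·d pixels to the field, where k is the kernel size and d the dilation rate. A tiny helper (illustrative arithmetic, not from the paper):

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in pixels, one axis) of a stack of stride-1
    dilated conv layers.  Each layer with dilation d extends the field
    by (kernel_size - 1) * d."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf
```

Three plain 3x3 layers see only 7 pixels, while the same three layers with dilations 1, 2, 4 see 15, which is why dilation is a cheap way to gather the global density context the GDM needs.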

  • Research Article
  • Cited by: 12
  • 10.1016/j.jag.2022.102677
Road marking extraction in UAV imagery using attentive capsule feature pyramid network
  • Jan 22, 2022
  • International Journal of Applied Earth Observation and Geoinformation
  • Haiyan Guan + 6 more

  • Research Article
  • Cited by: 16
  • 10.1080/21642583.2023.2247082
Small object detection in UAV image based on improved YOLOv5
  • Aug 15, 2023
  • Systems Science & Control Engineering
  • Jian Zhang + 5 more

The object detection network obtains less effective information from unmanned aerial vehicle (UAV) images, due to the small size of objects relative to the entire image, complex backgrounds, and densely packed objects. In response to these difficulties, this paper proposes a small object detection method for UAV images based on an improved YOLOv5 algorithm. First, the space-to-depth (SPD) conv module is introduced into the basic feature extraction network to mitigate the significant loss of image information during downsampling. Then, various attention mechanisms are added to intensify the acquisition of regions of interest in UAV images. Finally, the multiscale detection module is improved to enhance the network's ability to detect small objects in UAV images. Experiments on the VisDrone-DET2019 dataset show that the improved algorithm achieves a mean average precision (mAP) of 41.8%, which is 7.8% better than the baseline network. In addition, its detection performance exceeds that of most current mainstream object detection algorithms, giving it practical value.
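The space-to-depth rearrangement referenced above trades spatial resolution for channels without discarding pixels, unlike strided downsampling. A NumPy sketch of the transform (illustrative; the paper wraps this in a conv module):

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange an (H, W, C) array into (H/block, W/block, C*block^2),
    moving each block x block spatial patch into the channel axis so no
    pixel values are lost during downsampling."""
    h, w, c = x.shape
    assert h % block == 0 and w % block == 0, "H and W must be divisible by block"
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)  # group the intra-block offsets together
    return x.reshape(h // block, w // block, c * block * block)
```

A 1x1 convolution typically follows to mix the expanded channels; together they replace a stride-2 conv while keeping every input value available to later layers.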

  • Research Article
  • Cited by: 71
  • 10.1109/jstars.2023.3234161
A CNN-Transformer Hybrid Model Based on CSWin Transformer for UAV Image Object Detection
  • Jan 1, 2023
  • IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
  • Wanjie Lu + 6 more

Object detection in unmanned aerial vehicle (UAV) images has widespread applications in numerous fields; however, the complex background, diverse scales, and uneven distribution of objects in UAV images make object detection a challenging task. This study proposes a convolutional neural network (CNN)-transformer hybrid model to achieve efficient object detection in UAV images, with three advantages that contribute to improving detection performance. First, the efficient and effective cross-shaped window (CSWin) transformer can be used as a backbone to obtain image features at different levels, and the obtained features can be input into the feature pyramid network to achieve multiscale representation, which contributes to multiscale object detection. Second, a hybrid patch embedding module is constructed to extract and utilize low-level information such as the edges and corners of the image. Finally, a slicing-based inference method is constructed to fuse the inference results of the original image and sliced images, which improves small object detection accuracy without modifying the original network. Experimental results on public datasets illustrate that the proposed method improves performance more effectively than several popular and state-of-the-art object detection methods.
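Slicing-based inference of the kind described above first tiles the image into overlapping windows, runs the detector on each tile, then shifts the per-tile detections back and merges them with the full-image results via NMS. A sketch of the window generator (hypothetical tile size and overlap defaults, not the paper's values):

```python
def slice_windows(width, height, tile=512, overlap=0.25):
    """Generate overlapping tile boxes (x1, y1, x2, y2) covering an image
    of the given size.  Assumes width and height are at least `tile`;
    extra windows are appended so the right and bottom edges are covered."""
    step = max(1, int(tile * (1 - overlap)))
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    if xs[-1] + tile < width:
        xs.append(width - tile)   # snap a final column to the right edge
    if ys[-1] + tile < height:
        ys.append(height - tile)  # snap a final row to the bottom edge
    return [(x, y, x + tile, y + tile) for y in ys for x in xs]
```

The overlap matters: an object cut by one tile boundary falls wholly inside a neighbouring tile, and the later NMS pass removes the duplicate detections the overlap creates.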

  • Research Article
  • Cited by: 7
  • 10.1016/j.jag.2024.103871
Application of an improved U-Net with image-to-image translation and transfer learning in peach orchard segmentation
  • May 2, 2024
  • International Journal of Applied Earth Observation and Geoinformation
  • Jiayu Cheng + 7 more

  • Conference Article
  • Cited by: 60
  • 10.1109/rsip.2017.7958795
Fast vehicle detection in UAV images
  • May 1, 2017
  • Tianyu Tang + 4 more

Fast and accurate vehicle detection in unmanned aerial vehicle (UAV) images remains a challenge, due to their very high spatial resolution and very few annotations. Although numerous vehicle detection methods exist, most cannot achieve real-time detection across different scenes. Recently, deep learning algorithms have achieved impressive detection performance in computer vision, especially regression-based convolutional neural networks such as YOLOv2, which performs well in both accuracy and speed, outperforming other state-of-the-art detection methods. This paper is the first to investigate the use of YOLOv2 for vehicle detection in UAV images, and it also explores a new method for data annotation. Our method starts with image annotation and data augmentation. The CSK tracking method is used to help annotate vehicles in images captured from simple scenes. Subsequently, a regression-based single convolutional neural network, YOLOv2, is used to detect vehicles in UAV images. To evaluate our method, UAV video images were taken over several urban areas, and experiments were conducted on this dataset and the Stanford Drone dataset. The experimental results show that our data preparation strategy is useful and that YOLOv2 is effective for real-time vehicle detection in UAV video images.

  • Research Article
  • Cited by: 4
  • 10.1016/j.buildenv.2024.111705
Comparison of urban physical environments and thermal properties extracted from unmanned aerial vehicle images and ENVI-met model simulations
  • Jun 4, 2024
  • Building and Environment
  • Bonggeun Song + 3 more
