DAR-Net: Dense Attentional Residual Network for Vehicle Detection in Aerial Images.

Kaifeng Li, Bin Wang

doi:10.1155/2021/6340823

Abstract

With the rapid development of deep learning and the wide use of Unmanned Aerial Vehicles (UAVs), CNN-based algorithms for vehicle detection in aerial images have been widely studied in the past several years. Although it is a downstream task of general object detection, vehicle detection in aerial images differs from general object detection in ground-view images in several ways, e.g., larger image areas, smaller target sizes, and more complex backgrounds. In this paper, to improve the performance of this task, a Dense Attentional Residual Network (DAR-Net) is proposed. The proposed network employs a novel dense waterfall residual block (DW res-block) to effectively preserve spatial information and extract high-level semantic information at the same time. A multiscale receptive field attention (MRFA) module is also designed to select informative features from the feature maps and enhance the ability of multiscale perception. Based on the DW res-block and MRFA module, and to protect spatial information, the proposed framework adopts a new backbone that downsamples the feature map only 3 times; i.e., the total downsampling ratio of the proposed backbone is 8. These designs could alleviate the degradation problem, improve the information flow, and strengthen feature reuse. In addition, deep-projection units are used to reduce the impact of information loss caused by downsampling operations, and identity mapping is applied to each stage of the proposed backbone to further improve the information flow. The proposed DAR-Net is evaluated on the VEDAI, UCAS-AOD, and DOTA datasets. The experimental results demonstrate that the proposed framework outperforms other state-of-the-art algorithms.
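
The downsampling arithmetic in the abstract can be illustrated with a small sketch. It assumes each downsampling step halves the spatial resolution (stride 2), which is consistent with the abstract's figures (3 steps, ratio 8). The function names and the 24 x 48 pixel vehicle size are hypothetical, chosen only for illustration; the ratio of 32 is included for comparison because it is typical of classification backbones with five stride-2 stages.

```python
def total_downsampling_ratio(num_stride2_stages: int) -> int:
    """Total ratio after a number of stride-2 downsampling steps (assumed stride 2)."""
    return 2 ** num_stride2_stages


def feature_footprint(object_px: tuple, ratio: int) -> tuple:
    """Width and height, in feature-map cells, covered by an object of the given pixel size."""
    width, height = object_px
    return (width / ratio, height / ratio)


# Three downsampling steps, as stated in the abstract.
ratio_three_steps = total_downsampling_ratio(3)
# Five downsampling steps, shown only for comparison.
ratio_five_steps = total_downsampling_ratio(5)

# Hypothetical vehicle size in an aerial image (illustrative, not from the paper).
vehicle_px = (24, 48)

print(ratio_three_steps, feature_footprint(vehicle_px, ratio_three_steps))
print(ratio_five_steps, feature_footprint(vehicle_px, ratio_five_steps))
```

Under these assumptions, the vehicle spans 3 x 6 cells at a ratio of 8 but less than one cell in width at a ratio of 32, which is why a smaller total ratio helps retain the spatial detail needed to localize small targets.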

Highlights

Object detection, as an important topic in computer vision, aims to precisely localize the targets in given images and classify each target. This topic is of broad interest for potential applications such as face detection, pedestrian counting, automatic driving, and vehicle detection [1]. Before the emergence of deep learning, most traditional object detection algorithms, which are based on hand-crafted features, could be roughly divided into three steps: region selection, feature vector extraction, and region classification.
Although deeper residual networks have the advantage of exploring deeper features that contain rich semantic information, the spatial information contained in shallower features is corrupted and lost during processing.
A downsampling ratio of 32 leads to the loss of spatial information, which is harmful for object localization, especially for the relatively small objects in aerial images. Algorithms such as YOLOv2 [17] or Feature Pyramid Networks (FPN) [10] keep shallow spatial information through skip connections or feature fusion. These methods can only alleviate the problem; they cannot solve it. For these reasons, based on the proposed attentional dense waterfall residual block, a backbone designed for vehicle detection in aerial images is proposed. The proposed backbone preserves the spatial information from the following aspects

Summary

Introduction

Object detection, as an important topic in computer vision, aims to precisely localize the targets in given images and classify each target. This topic is of broad interest for potential applications such as face detection, pedestrian counting, automatic driving, and vehicle detection [1]. Before the emergence of deep learning, most traditional object detection algorithms, which are based on hand-crafted features, could be roughly divided into three steps: region selection, feature vector extraction, and region classification. Although object detection algorithms based on hand-crafted features have made some breakthroughs in detection accuracy, two significant limitations remain. First, they inevitably generate many redundant candidate regions during the region proposal step, which leads to an imbalanced class distribution during the region classification step. Second, hand-crafted feature extraction algorithms are not capable of capturing high-level semantic information; moreover, the low-level information they extract is not sufficient for complex localization and classification problems. Because of these limitations, traditional object detection algorithms are generally time-consuming and inaccurate.
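
To make the redundancy concrete, the sketch below counts the candidate regions produced by an exhaustive sliding window, which is one possible region selection strategy (the text above does not name a specific one). The image size, window size, stride, and vehicle count are hypothetical values, and treating exactly one window per vehicle as a positive sample is a simplifying assumption.

```python
def count_windows(image_size: tuple, window_size: tuple, stride: int) -> int:
    """Number of sliding-window candidates that fit fully inside the image."""
    img_w, img_h = image_size
    win_w, win_h = window_size
    if win_w > img_w or win_h > img_h:
        return 0
    windows_x = (img_w - win_w) // stride + 1
    windows_y = (img_h - win_h) // stride + 1
    return windows_x * windows_y


# Hypothetical aerial image, window, and stride (not taken from the paper).
candidates = count_windows((1024, 1024), (64, 64), 16)

# Hypothetical number of vehicles; assume one positive window per vehicle.
num_vehicles = 20
positive_fraction = num_vehicles / candidates

print(f"candidate regions: {candidates}")
print(f"fraction of positive regions: {positive_fraction:.4f}")
```

With these illustrative numbers, a single window size at a single scale already yields thousands of candidates, of which well under 1% contain a vehicle; this is the class imbalance the region classification step has to cope with.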



Journal: Computational Intelligence and Neuroscience
Publication Date: Jan 1, 2021
Citations: 4
License type: CC BY 4.0


