Abstract

The aim of this research is to demonstrate object detection on drone videos using the TensorFlow Object Detection API. The research examines the recognition quality and performance of popular object detection algorithms and feature extractors for recognizing people, trees, cars, and buildings in real-world video frames captured by drones. The study found that different object detection algorithms applied to “normal” images (from an ordinary camera) differ in the number of instances detected, detection accuracy, and computational cost, and that these algorithms behave differently again when applied to image data acquired by a drone. Object detection is a key component in achieving full autonomy for any robot, and unmanned aerial vehicles (UAVs) are a very active area of this field. To explore how state-of-the-art object detection algorithms perform on image data captured by UAVs, we ran extensive experiments and compared two representative types of state-of-the-art convolutional object detection systems, SSD and Faster R-CNN, with MobileNet, GoogleNet/Inception, and ResNet50 base feature extractors.

Highlights

  • An object recognition system uses a priori known object models to find real-world pairs from images of the world [1, 2]

  • Object detection, which applies computer vision (CV) and machine learning (ML), is a hot area of research in robotics

  • In the Faster R-CNN [13] model, the detection process is divided into two stages (Figure 2). The first stage, called the region proposal network (RPN), processes images with a feature extractor and uses features at a selected intermediate level (e.g., “conv5”) to predict class-agnostic box proposals. The second stage crops features from these box proposals and feeds them to the remainder of the feature extractor (e.g., “fc6” followed by “fc7”)
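As an illustrative sketch only (not the paper's code), the class-agnostic proposals produced by the first stage are typically filtered by non-maximum suppression before being passed on. The boxes, scores, and IoU threshold below are hypothetical values chosen for the example:

```python
def iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2); returns intersection-over-union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    # Keep the highest-scoring class-agnostic proposals, suppressing
    # any proposal that overlaps an already-kept one above the threshold.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping proposals plus one distant proposal:
proposals = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(proposals, scores)  # the lower-scoring overlapping box is dropped
```

In the real pipeline the surviving proposals are what the second stage crops features from; the sketch only shows the suppression logic itself.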


Summary

Introduction

An object recognition system uses a priori known object models to find real-world pairs from images of the world [1, 2]. It cannot be assumed that object detection algorithms that perform well on “normal” images also perform well on images taken by drones. Previous works stress that images captured by a drone often differ from those available for training, which are typically taken by a hand-held camera. Difficulties in detecting objects in drone data may arise from the positioning [9, 10] of the camera compared to images taken by a human, depending on what type of images the network was trained on. The aim of this work was to show whether a network trained on normal camera images can be used on images taken by a drone with satisfactory results.
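One way to quantify “satisfactory results” on drone footage is to match a detector's predicted boxes against hand-labeled ground truth at an IoU threshold and count true/false positives per frame. The sketch below assumes hypothetical box coordinates and a 0.5 threshold; it is a minimal evaluation helper, not the paper's actual protocol:

```python
def iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2); returns intersection-over-union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_detections(preds, truths, iou_thresh=0.5):
    # Greedily match each predicted box to its best unmatched ground-truth
    # box; a match with IoU >= threshold is a true positive.
    # Returns (true positives, false positives, false negatives).
    unmatched = list(range(len(truths)))
    tp = 0
    for p in preds:
        best, best_iou = None, 0.0
        for j in unmatched:
            v = iou(p, truths[j])
            if v > best_iou:
                best, best_iou = j, v
        if best is not None and best_iou >= iou_thresh:
            tp += 1
            unmatched.remove(best)
    return tp, len(preds) - tp, len(unmatched)

# One good detection and one spurious one against a single labeled object:
preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
truths = [(1, 1, 11, 11)]
tp, fp, fn = match_detections(preds, truths)
```

Accumulating these counts over all video frames gives the per-class precision and recall used to compare detectors.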

Why Choose TensorFlow Object Detection API
Meta-architectures
Experiment Results
Analyses
MobileNet