Abstract
Multiple object tracking (MOT) from unmanned aerial vehicle (UAV) videos faces several challenges, such as motion capture and appearance, clustering, object variation, high altitudes, and abrupt motion. Consequently, the volume of objects captured by the UAV is usually quite small, and the target object appearance information is not always reliable. To address these issues, a new deep learning-based tracking technique is presented that attains state-of-the-art performance on standard datasets, such as the Stanford Drone dataset and the Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking (UAVDT) dataset. The Faster RCNN (region-based convolutional neural network) baseline was enhanced by integrating a series of steps, including proper calibration of key parameters, multi-scale training, hard negative mining, and feature collection. Furthermore, a deep quadruplet network (DQN) was applied to track the movement of the captured objects in crowded environments, and it was modelled to use a new quadruplet loss function to learn the feature space. A deep 6 rectified linear units (ReLU) convolution was used in the Faster RCNN to mine spatial–spectral features. The experimental results on the standard datasets demonstrated high accuracy. Thus, the proposed method can be used to detect multiple objects and track their trajectories with high accuracy.
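As an illustration of the kind of objective a quadruplet network optimizes, the sketch below implements the commonly used quadruplet loss, which adds a second margin term to the triplet loss. This is not necessarily the new loss proposed in the paper, whose exact form is not given here; the function names, the squared Euclidean distance, and the margin values are all assumptions chosen for the example.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quadruplet_loss(anchor, positive, negative1, negative2,
                    margin1=1.0, margin2=0.5):
    """Commonly used quadruplet loss (assumed form, not the paper's).

    anchor, positive: embeddings of the same object identity.
    negative1, negative2: embeddings of two other, distinct identities.
    margin1, margin2: hypothetical margin values chosen for illustration.
    """
    d_ap = squared_distance(anchor, positive)
    d_an = squared_distance(anchor, negative1)
    d_nn = squared_distance(negative1, negative2)
    # Triplet-style term: the positive should be closer to the anchor
    # than the first negative by at least margin1.
    term1 = max(0.0, d_ap - d_an + margin1)
    # Extra term: the anchor-positive distance should also be smaller
    # than the distance between two different negative identities.
    term2 = max(0.0, d_ap - d_nn + margin2)
    return term1 + term2
```

The second term is what distinguishes a quadruplet loss from a triplet loss: it pushes same-identity pairs closer together than any pair of different identities, even when the anchor is not part of that pair.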
Highlights
Accepted: 16 March 2021
Object tracking [1] is a significant task in computer vision applications based on unmanned aerial vehicles (UAVs)
The proposed tracking technique, a quadruplet-based faster region-based convolutional neural network (RCNN), was compared with existing algorithms, such as Bayesian multi-object tracking (BMOT) [34], intersection-over-union tracker (IOUT) [15], global optimal greedy (GOG) algorithm [35], continuous energy minimization (CEM) [36], social long short-term memory (SLSTM) [37], simple online and real-time tracking (SORT) [21], relative long short-term memory (RLSTM) [38], and relative motion online tracking (RMOT) [39]
To examine the performance of the multiple object tracking (MOT) techniques, we utilized several identity metrics: identification precision (IDP), identification recall (IDR), and their F1 score, referred to as IDF1
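The identity metrics named above can be sketched as follows, using the standard definitions based on identity true positives (IDTP), identity false positives (IDFP), and identity false negatives (IDFN). The function name and the counts in the usage note are hypothetical, and the step that produces the counts (matching predicted trajectories to ground-truth trajectories) is not shown.

```python
def identity_metrics(idtp, idfp, idfn):
    """Return (IDP, IDR, IDF1) from identity-level match counts.

    idtp: detections assigned to the correct identity.
    idfp: predicted detections with no correct identity match.
    idfn: ground-truth detections with no correct identity match.
    """
    # Identification precision: share of predictions with correct identity.
    idp = idtp / (idtp + idfp) if (idtp + idfp) else 0.0
    # Identification recall: share of ground truth recovered with correct identity.
    idr = idtp / (idtp + idfn) if (idtp + idfn) else 0.0
    # IDF1 is the harmonic mean of IDP and IDR.
    denom = 2 * idtp + idfp + idfn
    idf1 = 2 * idtp / denom if denom else 0.0
    return idp, idr, idf1
```

For example, with made-up counts of 60 IDTP, 20 IDFP, and 40 IDFN, `identity_metrics(60, 20, 40)` gives an IDP of 0.75, an IDR of 0.6, and an IDF1 of about 0.667.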
Summary
Object tracking [1] is a significant task in computer vision applications based on unmanned aerial vehicles (UAVs). Compared to single object tracking (SOT), multiple object tracking (MOT) has to build the trajectories of all the objects in a specific video surveillance scene [1,2]. Online MOT in two-dimensional space is a complex task when there are similar objects [3]. With the improvement of object detection methods such as the single-shot detector (SSD), the faster region-based convolutional neural network (RCNN), and the deformable part model (DPM), a tracking-by-prediction framework achieves high performance for MOT because prediction can provide object locations and object trajectories. The above problems are more difficult in MOT than in SOT [11].