Abstract

Following the growing availability of low-cost, commercially available unmanned aerial vehicles (UAVs), more and more research effort has been focusing on object tracking using videos recorded from UAVs. However, tracking from UAV videos poses many challenges due to platform motion, including background clutter, occlusion, and illumination variation. This paper tackles these challenges by proposing a correlation filter-based tracker with feature fusion and saliency proposals. First, we integrate multiple feature types, such as dimensionality-reduced color name (CN) and histogram of oriented gradient (HOG) features, to improve the performance of correlation filters for UAV videos. Yet, a fused feature acting as a multivector descriptor cannot be directly used in prior correlation filters. Therefore, a fused feature correlation filter is proposed that can directly convolve with a multivector descriptor, in order to obtain a single-channel response that indicates the location of an object. Furthermore, we introduce saliency proposals as a re-detector to reduce background interference caused by occlusion or any distractor. Finally, an adaptive template-update strategy driven by saliency information is utilized to alleviate possible model drift. Systematic comparative evaluations performed on two popular UAV datasets show the effectiveness of the proposed approach.

Highlights

  • Recent years have witnessed significant developments in computer vision

  • All methods mentioned above cannot cope well with challenges appearing in such videos, which typically involve illumination variation, background clutter, and occlusion. To address these issues, we propose a robust tracking approach for unmanned aerial vehicle (UAV) videos, which offers three main contributions: (1) Fused features, composed of histogram of oriented gradient (HOG) and dimensionality-reduced color name (CN) features, are introduced to the correlation filter in order to improve the robustness of the appearance model in describing the target

  • We aim to develop an online tracking algorithm that is adaptive to significant appearance change without being prone to drifting, in which the extracted fused features are encoded in terms of multivectors
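
The fused multivector descriptor described above can be illustrated with a minimal sketch: the color-name channels are reduced with a simple PCA-style projection and concatenated with the HOG channels. All shapes, channel counts, and the specific reduction step here are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def reduce_cn(cn_map, k=2):
    """Project per-pixel CN vectors onto their top-k principal
    components (illustrative stand-in for dimensionality-reduced
    CN features; the paper's exact reduction may differ)."""
    h, w, c = cn_map.shape
    x = cn_map.reshape(-1, c)
    x = x - x.mean(axis=0)
    cov = x.T @ x / x.shape[0]          # channel covariance matrix
    vals, vecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:k]]  # top-k eigenvectors
    return (x @ top).reshape(h, w, k)

def fuse_features(hog_map, cn_map, k=2):
    """Concatenate HOG channels with reduced CN channels into one
    multichannel ('multivector') descriptor."""
    return np.concatenate([hog_map, reduce_cn(cn_map, k)], axis=2)

# Toy maps: assumed 31 HOG channels and the 11 color-name channels
hog = np.random.rand(50, 50, 31)
cn = np.random.rand(50, 50, 11)
fused = fuse_features(hog, cn, k=2)
print(fused.shape)  # (50, 50, 33)
```

Each spatial location of `fused` thus carries one stacked feature vector, which is the form the fused feature correlation filter is designed to convolve with directly.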

Introduction

Recent years have witnessed significant developments in computer vision. An enormous amount of research effort has gone into vision-based tasks, such as object tracking [1,2,3,4,5,6] and saliency detection [7,8,9,10]. Color features, such as color names (CN), help capture rich color characteristics, while histogram of oriented gradient (HOG) [12] features are adept at capturing abundant gradient information. Based on these feature descriptions, a variety of target tracking techniques have been proposed. However, it is difficult for them to meet the requirement of real-time tasks, i.e., processing a large number of frames per second on a standard PC without resorting to parallel computation [17]. From this viewpoint, correlation filters [18,19,20,21,22] show their strengths both in speed and in accuracy: the tracking problem is converted from the time domain to the frequency domain with the fast Fourier transform (FFT), so that convolution can be substituted with multiplication in order to achieve fast learning and target detection.
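
This frequency-domain substitution can be sketched in a few lines: per-channel correlation becomes a pointwise product of FFTs, and summing the channel responses yields the single-channel response map mentioned above. The feature shapes and the plain summed-channel response are illustrative assumptions, not the paper's exact filter formulation:

```python
import numpy as np

def correlation_response(feat, filt):
    """Correlate a multichannel feature map with a filter of the same
    shape via the FFT: conj(H) * F in the frequency domain equals
    circular cross-correlation in the spatial domain. Channel
    responses are summed into one single-channel map."""
    F = np.fft.fft2(feat, axes=(0, 1))
    H = np.fft.fft2(filt, axes=(0, 1))
    resp = np.fft.ifft2(np.conj(H) * F, axes=(0, 1)).real
    return resp.sum(axis=2)

feat = np.random.rand(64, 64, 33)   # fused multichannel features
filt = np.random.rand(64, 64, 33)   # learned filter (same shape)
resp = correlation_response(feat, filt)
peak = np.unravel_index(resp.argmax(), resp.shape)  # predicted location
print(resp.shape)  # (64, 64)
```

The speed advantage follows from replacing an O(n^2) spatial convolution per channel with FFTs and an elementwise product; correlating a patch with itself peaks at zero shift, which is a quick sanity check for such an implementation.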
