Anti-Distractor Active Object Tracking in 3D Environments

Mao Xi, Wengang Zhou, Yun Zhou, Houqiang Li, Zheng Chen

doi:10.1109/tcsvt.2021.3107153

Abstract

In active object tracking, the tracker receives a visual observation as input and must keep the target locked by autonomously adjusting the camera's position and pose. Previous works on active tracking assume that the environment contains only one object (a person) and no distractors. In this work, moving toward a more realistic setting, we address a more challenging scenario in which the tracker moves freely in 3D space, like an unmanned aerial vehicle (UAV), to track a person in various complex scenes with multiple distractors. To this end, we propose a novel end-to-end anti-distractor active object tracking framework that introduces multiple attention modules. On the one hand, we use the target template to learn an embedding that serves as channel-wise attention over the current observation, distinguishing the target from the distractors. On the other hand, we introduce temporal attention to fuse the observation history into a feature representation, which is then fed into a reinforcement learning network that outputs the tracker's action. To evaluate our method, we build several multi-object 3D environments in Unreal Engine; extensive experiments demonstrate the effectiveness of our approach.
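The two attention ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the sigmoid gating, the scaled dot-product scoring against the newest frame, and every name below (`channel_attention`, `temporal_attention`, the toy vectors) are assumptions chosen only to show the general shape of channel-wise and temporal attention. The reinforcement learning network that would consume the fused feature is omitted.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def channel_attention(obs_feat, template_embedding):
    """Rescale each channel of an observation feature by a gate in (0, 1).

    Assumption: the template embedding holds one logit per channel and is
    turned into a gate with a sigmoid.
    """
    gates = [sigmoid(t) for t in template_embedding]
    return [g * f for g, f in zip(gates, obs_feat)]

def temporal_attention(history):
    """Fuse a list of per-frame feature vectors into a single vector.

    Assumption: each frame is scored by its scaled dot product with the
    newest frame, and the scores are normalized with a softmax.
    """
    query = history[-1]
    dim = len(query)
    scale = math.sqrt(dim)
    scores = [sum(q * k for q, k in zip(query, frame)) / scale
              for frame in history]
    weights = softmax(scores)
    fused = [sum(w * frame[i] for w, frame in zip(weights, history))
             for i in range(dim)]
    return fused, weights

# Hypothetical toy data: three frames with three channels each.
template_embedding = [4.0, -4.0, 0.0]  # favors channel 0, suppresses channel 1
observations = [[1.0, 1.0, 1.0], [2.0, 0.5, 1.0], [1.5, 2.0, 0.0]]

history = [channel_attention(o, template_embedding) for o in observations]
fused, weights = temporal_attention(history)
```

In this sketch `fused` plays the role of the feature representation that would be passed to the policy network, and `weights` shows how much each past frame contributes to it.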

Full Text