Abstract

In active object tracking, the tracker receives visual observations as input and must keep the target locked in view by autonomously adjusting the camera's position and orientation. Previous work on active tracking assumes that the environment contains only one object (a person) and no distractors. In this work, we move toward a more realistic and challenging setting in which the tracker moves freely in 3D space, like an unmanned aerial vehicle (UAV), to track a person in a variety of complex scenes with multiple distractors. To this end, we propose a novel end-to-end anti-distractor active object tracking framework built on multiple attention modules. First, we learn an embedding from the target template and apply it as channel-wise attention to the current observation, to distinguish the target from the distractors. Second, we introduce temporal attention to fuse the observation history into a feature representation, which is then fed into a reinforcement learning network that outputs the tracker's action. To evaluate our method, we build several multi-object 3D environments in Unreal Engine; extensive experiments demonstrate the effectiveness of our approach.
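The two attention steps can be illustrated with a minimal sketch. The abstract does not specify the architecture, so everything below is an assumption made for illustration: the sigmoid gating, the scaled dot-product scoring, the use of the latest frame as the query, and all function and variable names are hypothetical, and random numbers stand in for learned features. The reinforcement learning network that would consume the fused feature is omitted.

```python
import math
import random


def channel_attention(feature_map, template_embedding):
    """Reweight each channel of an observation by a template-derived gate.

    feature_map: list of C channels, each a flat list of spatial activations.
    template_embedding: list of C scores (assumed output of a template encoder).
    """
    # Assumption: a sigmoid maps each score to a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-s)) for s in template_embedding]
    return [[g * v for v in channel] for g, channel in zip(gates, feature_map)]


def pool(feature_map):
    """Global average pooling: one value per channel."""
    return [sum(channel) / len(channel) for channel in feature_map]


def temporal_attention(history):
    """Fuse T per-frame vectors into one, using the latest frame as the query."""
    query = history[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * h for q, h in zip(query, step)) / scale for step in history]
    # Numerically stable softmax over the time axis.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [
        sum(w * step[i] for w, step in zip(weights, history))
        for i in range(len(query))
    ]
    return fused, weights


# Toy data standing in for learned features (hypothetical sizes).
random.seed(0)
C, S, T = 4, 9, 5  # channels, spatial positions, history length
template_embedding = [random.gauss(0, 1) for _ in range(C)]
frames = [[[random.gauss(0, 1) for _ in range(S)] for _ in range(C)] for _ in range(T)]

history = [pool(channel_attention(frame, template_embedding)) for frame in frames]
fused, weights = temporal_attention(history)
```

In this sketch, `fused` plays the role of the feature representation that would be passed to the policy network, and `weights` shows how much each past frame contributes to it.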
