Abstract

Multiple Object Tracking (MOT) faces great challenges in videos captured by Unmanned Aerial Vehicles (UAVs). Unlike traditional videos, due to the high altitude and abrupt motion changes of UAVs, target objects in UAV videos are usually very small and their appearance information is unreliable. Motion analysis is therefore important for associating multiple objects in UAV videos. However, traditional motion analysis models inevitably suffer from the autonomous motion of the UAV itself. In this paper, we propose a conditional Generative Adversarial Network (GAN) based model to predict complex motions in UAV videos. We regard object motions and UAV movement as individual motions and global motions, respectively. These two cues are complementary and are employed jointly to facilitate accurate motion prediction. Specifically, a social Long Short-Term Memory (LSTM) network is exploited to estimate the individual motion of objects, a Siamese network is constructed to capture the global motion caused by UAV view changes, and a conditional GAN is developed to generate the final motion affinity. Extensive experiments are conducted on public UAV datasets containing various types of objects and four different kinds of object detection inputs. Robust motion prediction and improved MOT performance are achieved compared with state-of-the-art methods.
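
The abstract outlines a three-part pipeline: an individual-motion encoder (social LSTM), a global-motion encoder (Siamese network over consecutive frames), and a conditional GAN that fuses both cues into a motion affinity. The following is a minimal, hypothetical sketch of how such a fusion could be wired up; it is not the authors' implementation, and all module names, layer sizes, and the fusion scheme are assumptions for illustration only.

```python
# Illustrative sketch (not the paper's code): fuse individual and global
# motion features with a conditional generator that outputs a motion affinity.
import torch
import torch.nn as nn


class IndividualMotionEncoder(nn.Module):
    """LSTM over each object's past trajectory (stand-in for the social LSTM)."""
    def __init__(self, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden_dim, batch_first=True)

    def forward(self, tracks):              # tracks: (N, T, 2) past (x, y) positions
        _, (h, _) = self.lstm(tracks)
        return h[-1]                         # (N, hidden_dim)


class GlobalMotionEncoder(nn.Module):
    """Shared-weight (Siamese) CNN on consecutive frames; the feature
    difference approximates the view change caused by UAV ego-motion."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )

    def forward(self, frame_prev, frame_curr):   # each: (1, 3, H, W)
        return self.backbone(frame_curr) - self.backbone(frame_prev)  # (1, feat_dim)


class ConditionalMotionGenerator(nn.Module):
    """Generator conditioned on the fused motion cues; outputs an affinity in [0, 1]."""
    def __init__(self, hidden_dim=64, feat_dim=64, noise_dim=16):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(hidden_dim + feat_dim + noise_dim, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, indiv_feat, global_feat):
        n = indiv_feat.size(0)
        cond = torch.cat([indiv_feat, global_feat.expand(n, -1)], dim=1)
        z = torch.randn(n, self.noise_dim)       # GAN noise input
        return self.net(torch.cat([cond, z], dim=1))   # (N, 1) motion affinity


if __name__ == "__main__":
    tracks = torch.randn(5, 10, 2)               # 5 objects, 10 past positions each
    prev, curr = torch.randn(1, 3, 128, 128), torch.randn(1, 3, 128, 128)
    affinity = ConditionalMotionGenerator()(
        IndividualMotionEncoder()(tracks),
        GlobalMotionEncoder()(prev, curr),
    )
    print(affinity.shape)                        # torch.Size([5, 1])
```

In a full tracker, the adversarial training of the generator and the exact data-association step would follow the paper's method; the sketch above only shows how the two complementary motion cues can condition a single affinity prediction.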
