Abstract

Multiple object tracking in drone videos is a vital vision task with broad application prospects, but most trackers rely on spatial or appearance cues alone to associate detections. Our proposed Multi-Tracker uses a novel similarity measure that combines position and appearance information. We designed the GM-YOLO network to provide high-quality detections as input to Multi-Tracker: a Coordinate Attention mechanism and a weighted Bidirectional Feature Pyramid Network structure are added to the backbone, and each feature point's effective receptive field is modeled as a Gaussian distribution. To accurately capture the motion and appearance features of each object, an adaptive-noise-covariance Kalman filter provides the position information, while the MB-OSNet network uses global features to learn contour information, retrieving images from a wider field of view while incorporating part-level elements that carry more fine-grained detail. Finally, the motion and appearance features are compared jointly to realize multi-object tracking. The performance of the GM-YOLO object detector and the Multi-Tracker was verified on the VisDrone MOT and UAVDT datasets.
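The joint position–appearance association described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the IoU-plus-cosine fusion, the weight `w`, and the function names are assumptions, standing in for the Kalman-predicted position similarity and the MB-OSNet appearance similarity.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    # Intersection-over-union of two boxes given as [x1, y1, x2, y2].
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def cosine(a, b):
    # Cosine similarity between two appearance embeddings.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def associate(pred_boxes, pred_feats, det_boxes, det_feats, w=0.5):
    """Fuse position similarity (IoU of predicted vs. detected boxes) and
    appearance similarity (cosine of embeddings) into one cost matrix,
    then solve the track-to-detection assignment (Hungarian algorithm).
    The 50/50 weighting is an illustrative assumption."""
    n, m = len(pred_boxes), len(det_boxes)
    cost = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            sim = w * iou(pred_boxes[i], det_boxes[j]) \
                + (1 - w) * cosine(pred_feats[i], det_feats[j])
            cost[i, j] = 1.0 - sim  # high similarity -> low cost
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows.tolist(), cols.tolist()))
```

In a full tracker, `pred_boxes` would come from the Kalman filter's prediction step and the embeddings from the re-identification network; unmatched pairs above a cost threshold would additionally be rejected.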
