Abstract

Multiple object tracking in drone videos is a vital vision task with broad application prospects, but most trackers rely on spatial or appearance cues alone to associate detections. Our proposed Multi-Tracker uses a novel similarity measure that combines position and appearance information. We designed the GM-YOLO network to provide high-quality detections as input to Multi-Tracker: a Coordinate Attention mechanism and a weighted Bidirectional Feature Pyramid Network structure are added to the backbone, and each feature point's effective receptive field is modeled as a Gaussian distribution. To accurately capture the motion and appearance features of each object, an adaptive-noise-covariance Kalman filter provides the position information, while the MB-OSNet network uses global features to learn contour information, retrieving images from a wider field of view while incorporating part-level elements that carry more fine-grained detail. Finally, the motion and appearance features are compared jointly to realize multi-object tracking. The performance of the GM-YOLO object detector and the Multi-Tracker was verified on the VisDrone MOT and UAVDT datasets.
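The joint position–appearance association described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the IoU-plus-cosine fusion, the weight `w`, and the function names are assumptions, standing in for the Kalman-predicted position similarity and the MB-OSNet appearance similarity.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    # Intersection-over-union of two boxes given as [x1, y1, x2, y2].
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def cosine(a, b):
    # Cosine similarity between two appearance embeddings.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def associate(pred_boxes, pred_feats, det_boxes, det_feats, w=0.5):
    """Fuse position similarity (IoU of predicted vs. detected boxes) and
    appearance similarity (cosine of embeddings) into one cost matrix,
    then solve the track-to-detection assignment (Hungarian algorithm).
    The 50/50 weighting is an illustrative assumption."""
    n, m = len(pred_boxes), len(det_boxes)
    cost = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            sim = w * iou(pred_boxes[i], det_boxes[j]) \
                + (1 - w) * cosine(pred_feats[i], det_feats[j])
            cost[i, j] = 1.0 - sim  # high similarity -> low cost
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows.tolist(), cols.tolist()))
```

In a full tracker, `pred_boxes` would come from the Kalman filter's prediction step and the embeddings from the re-identification network; unmatched pairs above a cost threshold would additionally be rejected.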
