Abstract

Although deep-learning-based multi-object tracking (MOT) approaches have achieved remarkable performance in terms of accuracy and efficiency, object occlusion remains an open challenge for most one-shot MOT methods. To address this problem, we present a recurrent across-channel and spatial attention-based one-shot multi-object tracking method with block-erasing data augmentation. First, we construct a multi-attention feature learning module, named RASFL, that combines recurrent across-channel attention with spatial attention. RASFL extracts both the correlations among feature channels and the differences among spatial locations to improve the accuracy of the re-identification (Re-ID) task. Second, we adopt a block-erasing data augmentation strategy that uses random pixel blocks to simulate occlusion cases during network training, which helps the network become more robust to object occlusions. By integrating the proposed RASFL module and the block-erasing data augmentation strategy into a one-shot online MOT system, we build an accurate and robust MOT model called DcMOT. Finally, we evaluate our method on the MOT16, MOT17 and MOT20 datasets and compare it comprehensively with several state-of-the-art MOT methods. The experimental results demonstrate that the proposed DcMOT model achieves competitive performance in terms of both accuracy and efficiency, with especially good performance in occlusion cases.
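
The block-erasing idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the number of blocks, their sizes, or their fill values, so the function name `block_erase` and the parameters `num_blocks`, `min_frac` and `max_frac` are hypothetical, and filling each block with uniform random pixel values is an assumption.

```python
import numpy as np


def block_erase(image, rng, num_blocks=3, min_frac=0.05, max_frac=0.2):
    """Overwrite random rectangular blocks of a uint8 image with random pixels.

    Illustrative sketch only: block count, block size range and fill values
    are assumptions, not settings taken from the paper.
    """
    h, w = image.shape[:2]
    out = image.copy()  # leave the input image untouched
    for _ in range(num_blocks):
        # Block height and width as random fractions of the image size.
        bh = max(1, int(h * rng.uniform(min_frac, max_frac)))
        bw = max(1, int(w * rng.uniform(min_frac, max_frac)))
        # Top-left corner chosen so the block lies fully inside the image.
        top = int(rng.integers(0, h - bh + 1))
        left = int(rng.integers(0, w - bw + 1))
        # Random pixel values stand in for an occluding object.
        block_shape = (bh, bw) + image.shape[2:]
        noise = rng.integers(0, 256, size=block_shape, dtype=np.uint8)
        out[top:top + bh, left:left + bw] = noise
    return out


rng = np.random.default_rng(0)
frame = np.zeros((64, 64, 3), dtype=np.uint8)  # placeholder training frame
augmented = block_erase(frame, rng)
```

In a training pipeline, a function like this would typically be applied to each input frame with some probability before the frame is passed to the network, so that the model sees partially hidden objects during training.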
