Multiple Object Tracking (MOT) has attracted considerable attention as a way for intelligent robots to track suspicious objects. However, MOT is easily degraded by long-term severe occlusion. This work proposes a joint MOT algorithm to handle such occlusion. The algorithm takes pairs of frames from complicated environments as input. A center-based feature extraction framework is designed to detect objects precisely and extract their feature maps. A ConvGRU module learns permanent representations from the historical spatio-temporal information of objects. A Hungarian matching method then matches the detected objects with the predicted ones. The proposed algorithm is compared with several representative methods on two public multi-object tracking benchmarks. Furthermore, this work constructs a database of videos captured in street scenarios and uses it to test the proposed algorithm and its peers. Experimental results demonstrate that the proposed algorithm outperforms its peers, especially under long-term severe occlusion, thus advancing the field of MOT.
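The matching step named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the cost function or any gating rule, so the Euclidean distance between object centers, the `max_dist` threshold, and the function names `hungarian` and `match_tracks` are all assumptions made for illustration, not the authors' implementation.

```python
from math import hypot, inf


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list `match` where match[i] is the column assigned to row i.
    Standard O(n^2 * m) potentials-based Hungarian algorithm.
    """
    n, m = len(cost), len(cost[0])
    assert n <= m, "rows must not outnumber columns"
    u = [0.0] * (n + 1)      # row potentials (1-based)
    v = [0.0] * (m + 1)      # column potentials (1-based)
    p = [0] * (m + 1)        # p[j] = row matched to column j, 0 = none
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            match[p[j] - 1] = j - 1
    return match


def match_tracks(predicted, detected, max_dist=50.0):
    """Match predicted track centers to detected centers (hypothetical helper).

    Cost is the Euclidean center distance (an assumption); pairs farther
    apart than `max_dist` are rejected and reported as unmatched.
    """
    if not predicted or not detected:
        return [], list(range(len(predicted))), list(range(len(detected)))
    cost = [[hypot(px - dx, py - dy) for dx, dy in detected]
            for px, py in predicted]
    transposed = len(predicted) > len(detected)
    if transposed:  # hungarian() needs rows <= columns
        cost = [list(col) for col in zip(*cost)]
    pairs = []
    for r, c in enumerate(hungarian(cost)):
        t, d = (c, r) if transposed else (r, c)
        dist = hypot(predicted[t][0] - detected[d][0],
                     predicted[t][1] - detected[d][1])
        if dist <= max_dist:
            pairs.append((t, d))
    pairs.sort()
    lost = [t for t in range(len(predicted)) if t not in {a for a, _ in pairs}]
    new = [d for d in range(len(detected)) if d not in {b for _, b in pairs}]
    return pairs, lost, new


if __name__ == "__main__":
    predicted = [(10, 10), (100, 100), (200, 50)]  # e.g. one occluded track
    detected = [(102, 98), (12, 9)]
    print(match_tracks(predicted, detected))
```

In this sketch a track left unmatched (for example, an occluded object) is simply returned in the `lost` list; how such tracks are kept alive during long occlusions is the role the abstract assigns to the ConvGRU module and is not modeled here.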