Abstract

Object tracking in thermal imagery is a challenging problem relevant to a growing number of applications. Fusing the complementary features of RGB and thermal images can improve tracking performance. mfSiamTrack (Multi-modality fusion for Siamese Network based RGB-T Tracking) is a dual-mode STSO (Short Term Single Object) tracker that operates in either thermal mode or multi-modality fusion (RGBT) mode. RGBT mode is activated when the dataset contains thermal infrared sequences together with the corresponding RGB sequences; the complementary features of the two modalities are fused, and tracking runs on the fused sequences. If the dataset contains only thermal sequences, the tracker operates in thermal mode. An auto-encoder (AE) based fusion network is proposed for multi-modality fusion. The encoder decomposes the RGB and thermal images into background and detail feature maps, the fusion layer fuses these maps across the two source images, and the decoder reconstructs the fused image. To handle objects at different scales and viewpoints, mfSiamTrack introduces a Multi-Scale Structural Similarity (MS-SSIM) based reconstruction method. mfSiamTrack is a fully convolutional Siamese network based tracker that also incorporates semi-supervised video object segmentation (VOS) for pixel-wise target identification. The tracker was evaluated on the VOT-RGBT2019 dataset using Accuracy, Robustness, Expected Average Overlap (EAO), and Average IoU as performance measures. In this evaluation, mfSiamTrack was observed to outperform state-of-the-art trackers.
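The dual-mode behaviour and the Average IoU measure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `select_mode`, `iou`, and `average_iou` are hypothetical, and the axis-aligned `(x, y, width, height)` box format is an assumption made for simplicity (the tracker itself also produces pixel-wise segmentation).

```python
from typing import Sequence, Tuple

# Assumed box format: axis-aligned (x, y, width, height).
Box = Tuple[float, float, float, float]


def select_mode(has_thermal: bool, has_rgb: bool) -> str:
    """Pick the tracking mode from the modalities present in the dataset.

    Hypothetical helper: RGBT mode needs thermal plus matching RGB
    sequences; thermal-only data falls back to thermal mode.
    """
    if not has_thermal:
        raise ValueError("thermal sequences are required in both modes")
    return "rgbt" if has_rgb else "thermal"


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_iou(preds: Sequence[Box], gts: Sequence[Box]) -> float:
    """Mean per-frame IoU between predicted and ground-truth boxes."""
    if len(preds) != len(gts) or not preds:
        raise ValueError("need equally sized, non-empty box sequences")
    return sum(iou(p, g) for p, g in zip(preds, gts)) / len(preds)
```

For example, `select_mode(has_thermal=True, has_rgb=False)` selects thermal mode, and `average_iou` averages the per-frame overlap over one sequence.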
