Abstract
Although RGB images have high spatial resolution, they depict color intensities only in the red, green, and blue channels. RGB-based trackers therefore fail easily in challenging scenarios, for example when the object and the background have similar colors. Hyperspectral images, with their rich spectral information, are more robust in these situations, so it is essential to explore how hyperspectral features can effectively supplement RGB information in object tracking. However, no existing tracking algorithm fuses hyperspectral and RGB data. To address this gap, we propose a novel hyperspectral and RGB fusion tracking framework, termed the Transformer-based Fusion Tracking Network (TFTN), to enhance object tracking performance. Within the framework, we construct a dual-branch structure based on the Siamese network to obtain modality-specific representations of the images from each modality. The framework is generic and applies to the Siamese family of tracking algorithms. In addition, we design a Siamese three-dimensional convolutional neural network as the hyperspectral branch, which extracts the spatial and spectral features of hyperspectral data simultaneously and so makes full use of hyperspectral data to improve tracking performance. In particular, inspired by the structure of the Transformer, we design a Transformer-based fusion module to capture the potential interactions among intra-modality and inter-modality features. This is the first work to combine information from the hyperspectral and RGB modalities to improve tracking performance. It is also the first to employ the self-attention module of the Transformer to combine information from different modalities for multi-modality fusion tracking.
Experimental results on a dataset composed of hyperspectral and RGB image sequences show that the proposed TFTN tracker outperforms state-of-the-art trackers, demonstrating the effectiveness of the method.
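To illustrate how self-attention can capture intra-modality and inter-modality interactions at once, the following is a minimal sketch, not the TFTN fusion module itself. It assumes that each branch has already produced a list of feature tokens, and it omits the learned query, key, and value projections, multi-head structure, and feed-forward layers of a real Transformer. The names `attention`, `fuse`, `rgb_tokens`, and `hsi_tokens` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def fuse(rgb_tokens, hsi_tokens):
    """Self-attention over the concatenated tokens of both modalities.

    Assumption for illustration: because every token attends to all
    tokens, each output mixes features from its own modality
    (intra-modality) and from the other one (inter-modality).
    """
    tokens = rgb_tokens + hsi_tokens
    return attention(tokens, tokens, tokens)


# Toy features: two RGB tokens and one hyperspectral token, dimension 2.
rgb_tokens = [[1.0, 0.0], [0.0, 1.0]]
hsi_tokens = [[1.0, 1.0]]
fused = fuse(rgb_tokens, hsi_tokens)
```

Each fused token is a convex combination of all input tokens, so in this toy setting the RGB outputs already carry a contribution from the hyperspectral token and vice versa.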
More From: IEEE Transactions on Geoscience and Remote Sensing