Abstract
RGBT target tracking is an important downstream task in object tracking. Compared with visible-light tracking, however, it has smaller datasets available, which makes it difficult to reach comparable performance. Two questions follow: how to combine the complementary characteristics of the visible and thermal modalities effectively, and how to fully exploit the strong performance of models trained on visible-light tracking, while keeping computational cost low and tracking performance high. To address them, an RGBT tracking network built on a dual-prompt complementary fusion strategy is proposed. Drawing on prompt learning, the network aims to transfer the strong performance of visible-light trackers to the RGBT domain. The prompt module feeds the visible and thermal modality information into the backbone network as dual prompts, and the backbone uses these prompts to generate new, enriched prompt information at each layer. An information enhancement fusion module then enhances this prompt information and feeds it back into the backbone to improve tracking accuracy and robustness. Experiments on the GTOT, RGBT234, and LasHeR datasets show that the network reaches a precision rate (PR) and success rate (SR) of 93.1%/76.8%, 84.4%/62.4%, and 66.8%/53.8%, respectively, improving on current mainstream RGBT tracking networks and verifying the effectiveness of the approach.
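The abstract reports PR and SR without defining them. The sketch below shows how these two metrics are commonly computed in tracking benchmarks; it is an illustration under stated assumptions, not the evaluation code used for the reported numbers. Boxes are assumed to be (x, y, width, height) tuples, the function names are hypothetical, and the 20-pixel distance threshold is an assumed default (benchmarks with small targets, such as GTOT, typically use 5 pixels).

```python
from math import hypot


def center(box):
    """Return the center point of an (x, y, w, h) box."""
    x, y, w, h = box
    return x + w / 2.0, y + h / 2.0


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def precision_rate(pred, gt, dist_threshold=20.0):
    """Fraction of frames whose center error is within dist_threshold pixels.

    The threshold value is an assumption; it differs between benchmarks.
    """
    hits = 0
    for p, g in zip(pred, gt):
        (px, py), (gx, gy) = center(p), center(g)
        if hypot(px - gx, py - gy) <= dist_threshold:
            hits += 1
    return hits / len(gt)


def success_rate(pred, gt, num_thresholds=21):
    """Approximate area under the success plot.

    For each overlap threshold in [0, 1], compute the fraction of frames
    whose IoU exceeds it, then average over all thresholds.
    """
    overlaps = [iou(p, g) for p, g in zip(pred, gt)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    curve = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(curve) / len(curve)
```

Both functions take per-frame lists of predicted and ground-truth boxes for one sequence. Benchmark toolkits differ in how they aggregate across sequences and in whether the PR threshold is fixed or normalized by target size, so values from this sketch are not directly comparable to the percentages above.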