Abstract
RGB-T tracking has been widely applied in fields such as robotics, surveillance, and autonomous driving. In contrast, the development of RGB-D tracking remains relatively slow. Existing RGB-D trackers still adopt the paradigm of matching appearance features by extracting spatial information between the target template and the search region. However, this paradigm struggles to capture target appearance variations even in traditional RGB tracking, let alone on more complex multimodal object tracking datasets. To improve the performance of RGB-D trackers, we propose a novel temporal adaptive bidirectional bridging framework for RGB-D tracking, named TABBTrack. TABBTrack employs a three-stream architecture with temporal features and uses a bidirectional bridging module to iteratively bridge RGB and depth modality information, enabling full cross-modal feature interaction. In addition, we update temporal information with low complexity. Extensive experiments on four popular RGB-D tracking benchmarks demonstrate that our method achieves state-of-the-art performance while running at real-time speed.
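To make the bidirectional bridging idea concrete, below is a minimal PyTorch sketch of one plausible realization: a block in which RGB tokens attend to depth tokens and vice versa, applied iteratively across several layers. All names (`BidirectionalBridgeBlock`), dimensions, and the layer count are illustrative assumptions, not the authors' actual TABBTrack implementation.

```python
import torch
import torch.nn as nn

class BidirectionalBridgeBlock(nn.Module):
    """One bridging step: RGB tokens query depth tokens and vice versa.

    Hypothetical sketch of the cross-modal interaction described in the
    abstract; the real TABBTrack module may differ.
    """
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.rgb_from_depth = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.depth_from_rgb = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_rgb = nn.LayerNorm(dim)
        self.norm_depth = nn.LayerNorm(dim)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
        # Cross-modal attention in both directions, with residual connections
        # so each stream keeps its own features while absorbing the other's.
        r, _ = self.rgb_from_depth(self.norm_rgb(rgb), depth, depth)
        d, _ = self.depth_from_rgb(self.norm_depth(depth), rgb, rgb)
        return rgb + r, depth + d

# Iteratively bridging the two modality streams across several layers:
rgb = torch.randn(1, 256, 768)    # (batch, tokens, dim) RGB search-region features
depth = torch.randn(1, 256, 768)  # matching depth features
bridge = nn.ModuleList([BidirectionalBridgeBlock(768) for _ in range(4)])
for block in bridge:
    rgb, depth = block(rgb, depth)
```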