Sparse mixed attention aggregation network for multimodal images fusion tracking

Mingzheng Feng, Jianbo Su

doi:10.1016/j.engappai.2023.107273

Abstract

Transformer-based trackers have achieved strong performance in recent years. However, they usually separate information extraction from information integration, which weakens the interaction between the target and the search region. In addition, they rely on the conventional Transformer to model long-range dependencies, which leads to a lack of focus on the primary information that high-accuracy trackers need. This paper proposes a sparse mixed attention aggregation model for robust tracking based on visible and thermal infrared images. Specifically, a backbone network composed of sparse mixed attention is designed to perform both information extraction and integration. This helps the network obtain specific discriminative features and enhances communication among them. To fully exploit the complementary visible and thermal information, a confidence-aware aggregation network is designed that learns a reliable confidence for the visible and thermal branches. Finally, a corner-based localization head is introduced to estimate the target state. Extensive experiments on three large-scale multimodal tracking benchmarks show that the proposed tracker outperforms other advanced trackers.
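
The abstract does not give the exact formulation, so the following is only an illustrative sketch of the two ideas it names, not the authors' implementation. It assumes that "sparse" means keeping the top-k attention scores per query, that "mixed" means search-region queries attend over concatenated template and search tokens, and that confidence-aware aggregation is a softmax-weighted sum of the two branch features. All function names, shapes, and the confidence logits are hypothetical, and learned projection layers are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; entries equal to -inf receive weight 0.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def topk_sparse_attention(q, k, v, top_k):
    # Assumption: sparsity is obtained by keeping the top_k scores per query.
    scores = q @ k.T / np.sqrt(q.shape[-1])             # (Nq, Nk)
    # The k-th largest score in each row acts as that row's threshold.
    kth = np.sort(scores, axis=-1)[:, -top_k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)   # drop the rest
    weights = softmax(masked, axis=-1)
    return weights @ v, weights

def confidence_fusion(feat_rgb, feat_tir, logit_rgb, logit_tir):
    # Assumption: one scalar confidence logit per branch, normalized by softmax.
    w = softmax(np.array([logit_rgb, logit_tir]))
    return w[0] * feat_rgb + w[1] * feat_tir, w

rng = np.random.default_rng(0)
template = rng.standard_normal((4, 8))   # hypothetical template tokens
search = rng.standard_normal((6, 8))     # hypothetical search-region tokens

# Mixed attention: search queries attend over template and search tokens together.
mixed = np.concatenate([template, search], axis=0)
out_rgb, weights = topk_sparse_attention(search, mixed, mixed, top_k=3)

# Stand-in for the thermal branch output, fused with made-up confidence logits.
out_tir = rng.standard_normal(out_rgb.shape)
fused, conf = confidence_fusion(out_rgb, out_tir, logit_rgb=1.2, logit_tir=0.4)
```

In this sketch each search token keeps only three attention weights, so most template and search tokens contribute nothing to its output, and the branch with the larger logit dominates the fused feature.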


Journal: Engineering Applications of Artificial Intelligence
Publication Date: Oct 14, 2023
Citations: 2

