Articles published on Target Appearance Variations
71 Search results
- Research Article
- 10.1109/tip.2026.3674367
- Jan 1, 2026
- IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society
- Zhaodong Ding + 3 more
Each sequence in existing RGBT tracking datasets is typically captured from a single platform equipped with both RGB (visible light) and TIR (thermal infrared) sensors. In real-world applications, tracking some objects requires cross-platform collaboration and these platforms might be equipped with different sensors. However, changes in modalities and platforms may cause significant variations in target appearance and abrupt position shifts, which existing RGBT trackers struggle to handle. To address these challenges, we define a new task, termed dynamic RGBT tracking, focusing on cross-platform and modality-variant scenarios. Considering the dynamic changes of modalities and platforms, we investigate dynamic RGBT tracking from a causal perspective, and assume that images consist of causal factors (target-relevant information) and non-causal factors (target-irrelevant information, i.e., modality/platform information), where only the former is conducive to stable tracking. Based on this assumption, we propose a novel causality-based modality- and platform-invariant representation learning approach to capture robust invariant representations for dynamic RGBT tracking. In particular, to mitigate the challenges posed by modality variations, we design a causal consistency encoder that introduces an intervener to model feature uncertainty and simulate modal variations, compelling the model to focus on modality-invariant features to improve tracking robustness. To overcome the issue of abrupt view changes and position shifts, we design a platform-independent global searcher to re-localize the target whenever a platform switch occurs, which leverages an intervener to simulate the interference of platform changes on features, encouraging the searcher to learn platform-invariant representations for improved localization accuracy.
In addition, to promote the research and development of dynamic RGBT tracking, we construct a dataset named DRGBT603, which consists of 603 sequences with a total of 1.49 M frame pairs. Extensive experiments on the DRGBT603 dataset validate the effectiveness of the proposed method against other state-of-the-art methods. Our code and data are available at https://github.com/dongdong2061/DRGBT.
- Research Article
1
- 10.3390/sym17091443
- Sep 3, 2025
- Symmetry
- Kun Wang + 2 more
Characters on liquid crystal display (LCD) interfaces often appear densely arranged, with complex image backgrounds and significant variations in target appearance, posing considerable challenges for visual detection. To improve the accuracy and robustness of character detection, this paper proposes an enhanced character detection algorithm based on the DBNet framework, named EBiDNet (EfficientNetV2 and BiFPN Enhanced DBNet). This algorithm integrates machine vision with deep learning techniques and introduces the following architectural optimizations. It employs EfficientNetV2-S, a lightweight, high-performance backbone network, to enhance feature extraction capability. Meanwhile, a bidirectional feature pyramid network (BiFPN) is introduced. Its distinctive symmetric design ensures balanced feature propagation in both top-down and bottom-up directions, thereby enabling more efficient multiscale contextual information fusion. Experimental results demonstrate that, compared with the original DBNet, the proposed EBiDNet achieves a 9.13% increase in precision and a 14.17% improvement in F1-score, while reducing the number of parameters by 17.96%. In summary, the proposed framework maintains a lightweight design while achieving high accuracy and strong robustness under complex conditions.
- Research Article
- 10.1007/s44443-025-00144-w
- Jul 24, 2025
- Journal of King Saud University Computer and Information Sciences
- Huiwei Shi + 4 more
RGBT (visible-thermal) object tracking holds significant value in complex scenarios such as low-light and hazy environments, enabling robust all-weather tracking by leveraging the complementary strengths of visible and thermal infrared modalities. However, challenges such as target appearance variations, similar object interference, and camera motion often lead to tracking drift. This paper proposes RecheckTrack, a robust RGBT tracking framework that addresses these issues through the enhancement of temporal information and a backward trajectory verification mechanism. The dual-branch fusion network adaptively learns target dynamics using appearance tokens and modality tokens. Modality tokens focus on high-quality features and target-probable regions, while appearance tokens track dynamic changes in target appearance, improving robustness against deformation, occlusion, and scale variations. To mitigate drift caused by sudden target or camera motion, a recheck network is introduced, which employs a two-stage candidate box selection method and jointly matches targets using bidirectional tracking consistency and appearance similarity. Additionally, for long-term tracking scenarios where targets may be lost, the recheck network is improved with a path-consistency-based backward trajectory selection method and an approximate global search strategy, efficiently recovering lost targets. Experiments on the VTUAV, LasHeR, and RGBT234 datasets demonstrate that RecheckTrack significantly reduces tracking drift and improves accuracy, providing an effective solution for RGBT tracking in complex scenarios.
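The idea of jointly matching candidates by bidirectional tracking consistency and appearance similarity can be illustrated with a small sketch. This is not RecheckTrack's implementation: the function names, the use of IoU as the consistency measure, and the weight `alpha` are all assumptions made for illustration. Each candidate is assumed to come with a box obtained by tracking backward to the previous frame and an appearance similarity in [0, 1].

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_candidate(prev_box, backtracked_boxes, appearance_sims, alpha=0.5):
    """Pick the candidate with the best joint score (hypothetical rule).

    backtracked_boxes[i] is where candidate i lands when tracked backward
    to the previous frame; a consistent candidate should land on prev_box.
    alpha weights consistency against appearance similarity (assumed value).
    """
    scores = [
        alpha * iou(prev_box, back) + (1.0 - alpha) * sim
        for back, sim in zip(backtracked_boxes, appearance_sims)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

In this sketch a distractor that looks very similar to the target but backtracks to a different location receives a low consistency term, so it can lose to a slightly less similar candidate that backtracks onto the previous target box.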
- Research Article
1
- 10.22581/muet1982.2832
- Oct 1, 2024
- Mehran University Research Journal of Engineering and Technology
- Mubashar Masood + 1 more
Target tracking via Correlation Filter (CF) is an active research area in computer vision and offers several practical benefits. Existing CF algorithms struggle when target appearance varies due to background noise, scale and illumination changes, occlusion, and fast motion, which severely degrades overall tracker performance. To be useful in practice, an object tracker should perform well with a low computational burden under challenging real-time conditions. To address this issue, a novel visual object tracker is proposed based on multi-feature fusion and an adaptive learning technique with aberrance suppression. First, multiple features, i.e., Histogram of Oriented Gradients (HOG), Color Naming (CN), saliency, and gray-level intensities, are combined using a feature fusion technique. Further, based on an evaluation of the final fused response map using the Peak-to-Sidelobe Ratio (PSR), an adaptive learning strategy is integrated to improve the learning phase of the tracker. Tracking results show that the proposed strategy outperforms other modern CF trackers, with Distance Precision (DP) scores of 88.2%, 85.9%, 74.1%, and 64.7% on the OTB2013, OTB2015, TempleColor128, and UAV123 datasets, respectively.
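For readers unfamiliar with PSR, the following minimal sketch computes the commonly used definition, (peak - mean of sidelobe) / standard deviation of sidelobe, on a response map stored as a list of rows. The learning-rate rule is a hypothetical illustration of PSR-driven adaptive learning, not the rule from the paper; `base_rate`, `psr_threshold`, and the exclusion window size are assumed values.

```python
from statistics import mean, pstdev


def peak_to_sidelobe_ratio(response, exclude=1):
    """PSR of a 2-D response map given as a list of rows.

    The sidelobe is every cell outside a (2*exclude+1)-wide window
    centred on the peak; the window size is an assumption here.
    """
    rows, cols = len(response), len(response[0])
    peak, pr, pc = max(
        (response[r][c], r, c) for r in range(rows) for c in range(cols)
    )
    sidelobe = [
        response[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - pr) > exclude or abs(c - pc) > exclude
    ]
    # Small epsilon avoids division by zero on a perfectly flat sidelobe.
    return (peak - mean(sidelobe)) / (pstdev(sidelobe) + 1e-12)


def adaptive_learning_rate(psr, base_rate=0.02, psr_threshold=5.0):
    """Hypothetical rule: scale the model update down when PSR is low."""
    if psr >= psr_threshold:
        return base_rate
    return base_rate * max(psr, 0.0) / psr_threshold
```

A sharp, isolated peak gives a high PSR and a full-rate update, while a weak or ambiguous response (for example during occlusion) gives a low PSR and a reduced update, which limits contamination of the filter.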
- Research Article
7
- 10.1016/j.patcog.2024.111053
- Sep 27, 2024
- Pattern Recognition
- Ge Ying + 4 more
Temporal adaptive bidirectional bridging for RGB-D tracking
- Research Article
1
- 10.1016/j.eswa.2024.125083
- Aug 14, 2024
- Expert Systems With Applications
- Ruke Xiong + 4 more
Online discrimination-correction subnet for background suppression and dynamic template updating in Siamese visual tracking
- Research Article
34
- 10.1016/j.inffus.2024.102562
- Jul 2, 2024
- Information Fusion
- Ashish Kumar + 5 more
Correlation filter based single object tracking: A review
- Research Article
3
- 10.1016/j.ins.2023.118954
- Apr 20, 2023
- Information Sciences
- Huayue Cai + 5 more
Online intervention siamese tracking
- Research Article
41
- 10.1016/j.patcog.2023.109514
- Mar 9, 2023
- Pattern Recognition
- Yan Liu + 4 more
Bi-RRNet: Bi-level recurrent refinement network for camouflaged object detection
- Research Article
44
- 10.1016/j.ins.2022.12.082
- Jan 2, 2023
- Information Sciences
- Nana Fan + 4 more
Siamese residual network for efficient visual tracking
- Research Article
6
- 10.1109/access.2023.3279868
- Jan 1, 2023
- IEEE Access
- Zhiyuan Li + 2 more
Object tracking is a crucial research area within the field of intelligent transportation, providing a vital foundation for anomalous behavior analysis and traffic statistics. Although pedestrian detectors have shown impressive results, leading to the advancement of detection-based tracking methods, target association in complex scenarios remains a difficult and less efficient task due to the lack of feature robustness in the presence of partial occlusions. In the proposed tracking method, we extract convolutional features on each entire object and its local blocks, segmented by the superpixel algorithm. Aiming to emphasize the global and local information respectively, the global features for each entire object are extracted from the last layer of the backbone network, while local features are derived from a specific intermediate layer of the backbone network. The association between tracked targets and detected pedestrian candidates relies on fused similarity degrees. Furthermore, we use the transformer’s self-attention mechanism to predict features for the current frame based on the information within past frames, aiming to eliminate the effects of target appearance variations. Additionally, we remove redundant background pixels in the detected rectangles of pedestrian candidates by using a background modeling algorithm. Experimental results demonstrate that the tracker proposed in this paper outperforms other trackers across five publicly available datasets, indicating its effectiveness and potential for further development.
- Research Article
8
- 10.1016/j.neucom.2022.10.031
- Oct 14, 2022
- Neurocomputing
- Tianpeng Liu + 5 more
Visual tracking with dumbbell selection network
- Research Article
40
- 10.1109/tcsvt.2021.3094645
- May 1, 2022
- IEEE Transactions on Circuits and Systems for Video Technology
- Zikun Zhou + 4 more
Temporal and spatial contexts, characterizing target appearance variations and target-background differences, respectively, are crucial for improving the online adaptive ability and instance-level discriminative ability of object tracking. However, most existing trackers focus on either the temporal context or the spatial context during tracking and have not exploited these contexts simultaneously and effectively. In this paper, we propose a Spatial-TEmporal Memory (STEM) network to exploit these contexts jointly for object tracking. Specifically, we develop a key-value structured memory model equipped with a key-value index-based memory reading mechanism to model the spatial and temporal contexts simultaneously. To update the memory with new target states and ensure the diversity of the memory, we introduce a similarity-aware memory update scheme. In addition, we construct an entropy-guided ensemble strategy to fuse the prediction models based on these two contexts, such that these two contexts can be exploited to estimate the target state jointly. Extensive experimental results on eight challenging datasets, including OTB2015, TC128, UAV123, VOT2018, LaSOT, TrackingNet, GOT-10k, and OxUvA, demonstrate that the proposed method performs favorably against state-of-the-art trackers.
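A key-value memory with a similarity-aware update can be sketched in a few lines. This is a generic illustration under stated assumptions, not the STEM network: the softmax dot-product read, the cosine threshold `sim_threshold`, the `capacity`, and the drop-oldest policy are all hypothetical choices.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)


def read_memory(query, keys, values):
    """Soft read: softmax over query-key dot products weights the values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shifted for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return out, weights


def update_memory(keys, values, new_key, new_value,
                  capacity=4, sim_threshold=0.95):
    """Assumed policy: overwrite a near-duplicate entry, else append,
    dropping the oldest entry when the memory is full."""
    sims = [cosine(new_key, k) for k in keys]
    if sims and max(sims) >= sim_threshold:
        i = sims.index(max(sims))
        keys[i], values[i] = new_key, new_value
        return
    if len(keys) >= capacity:
        keys.pop(0)
        values.pop(0)
    keys.append(new_key)
    values.append(new_value)
```

Overwriting near-duplicates instead of appending them keeps the stored target states diverse, which is the stated purpose of a similarity-aware update.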
- Research Article
4
- 10.1016/j.jvcir.2022.103456
- Feb 16, 2022
- Journal of Visual Communication and Image Representation
- Jing Liu + 3 more
Tracking by dynamic template: Dual update mechanism
- Research Article
2
- 10.1016/j.sigpro.2022.108463
- Jan 17, 2022
- Signal Processing
- Lin Zhou + 4 more
Robust DCF object tracking with adaptive spatial and temporal regularization based on target appearance variation
- Research Article
- 10.2139/ssrn.4016128
- Jan 1, 2022
- SSRN Electronic Journal
- Long Xu + 2 more
Fast and Efficient Target Appearance Variations Consistency Tracking
- Research Article
2
- 10.3390/s21237790
- Nov 23, 2021
- Sensors (Basel, Switzerland)
- Hang Chen + 2 more
Recently, the Siamese architecture has been widely used in the field of visual tracking, and has achieved great success. Most Siamese-network-based trackers aggregate the target information of two branches by cross-correlation. However, since the location of the sampling points in the search feature area is pre-fixed in the cross-correlation operation, these trackers suffer from either background noise influence or missing foreground information. Moreover, the cross-correlation between the template and the search area neglects the geometry information of the target. In this paper, we propose a Siamese deformable cross-correlation network to model the geometric structure of the target and improve the performance of visual tracking. We propose to learn an offset field end-to-end in cross-correlation. With the guidance of the offset field, the sampling in the search image area can adapt to the deformation of the target, and realize the modeling of the geometric structure of the target. We further propose an online classification sub-network to model the variation of target appearance and enhance the robustness of the tracker. Extensive experiments are conducted on four challenging benchmarks, including OTB2015, VOT2018, VOT2019 and UAV123. The results demonstrate that our tracker achieves state-of-the-art performance.
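To make the "pre-fixed sampling points" concrete, here is a minimal single-channel sketch of the plain cross-correlation that Siamese trackers build on. It is a generic illustration with hypothetical function names, not the paper's network, and it does not implement the learned offset field: every output cell samples the search map on the same rigid grid defined by the template shape.

```python
def cross_correlate(search, template):
    """Slide the template over the search map on a fixed grid.

    Both inputs are 2-D lists (single channel). Output cell (i, j) is the
    dot product of the template with the search window whose top-left
    corner is (i, j); the sampling positions never adapt to the target.
    """
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    return [
        [
            sum(
                search[i + u][j + v] * template[u][v]
                for u in range(th)
                for v in range(tw)
            )
            for j in range(sw - tw + 1)
        ]
        for i in range(sh - th + 1)
    ]


def argmax2d(response):
    """Row and column of the largest response value."""
    _, i, j = max(
        (response[i][j], i, j)
        for i in range(len(response))
        for j in range(len(response[0]))
    )
    return i, j
```

A deformable variant would add a learned per-position offset to each sampling location `(i + u, j + v)` and interpolate the search map there, so the sampling pattern can follow a deformed target.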
- Research Article
20
- 10.1016/j.infrared.2021.103825
- Jul 9, 2021
- Infrared Physics & Technology
- Tingting Yao + 5 more
Scale and appearance variation enhanced siamese network for thermal infrared target tracking
- Research Article
7
- 10.1016/j.imavis.2021.104181
- Apr 24, 2021
- Image and Vision Computing
- Lang Yu + 3 more
Online-adaptive classification and regression network with sample-efficient meta learning for long-term tracking
- Research Article
17
- 10.1016/j.neunet.2021.04.004
- Apr 16, 2021
- Neural Networks
- Nana Fan + 4 more
Learning dual-margin model for visual tracking