Abstract

Most popular graph attention networks treat pixels of a feature map as individual nodes, which makes the feature embedding extracted by the graph convolution lack the integrity of the object. Moreover, matching between a template graph and a search graph using only part-level information usually causes tracking errors, especially in occlusion and similarity situations. To address these problems, we propose a novel end-to-end graph attention tracking framework that has high symmetry, combining traditional cross-correlation operations directly. By utilizing cross-correlation operations, we effectively compensate for the dispersion of graph nodes and enhance the representation of features. Additionally, our graph attention fusion model performs both part-to-part matching and global matching, allowing for more accurate information embedding in the template and search regions. Furthermore, we optimize the information embedding between the template and search branches to achieve better single-object tracking results, particularly in occlusion and similarity scenarios. The flexibility of graph nodes and the comprehensiveness of information embedding have brought significant performance improvements in our framework. Extensive experiments on three challenging public datasets (LaSOT, GOT-10k, and VOT2016) show that our tracker outperforms other state-of-the-art trackers.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call