Abstract
Single object tracking (SOT) is one of the most active research directions in computer vision. Compared with SOT on 2-D images, which has been well studied, SOT on 3-D point clouds is a relatively new research field. In this article, a novel approach, the contextual-aware tracker (CAT), is proposed to achieve superior 3-D SOT through spatial and temporal contextual learning from LiDAR sequences. More precisely, in contrast to previous 3-D SOT methods, which use only the point clouds inside the target bounding box as the template, CAT generates templates by adaptively including the surroundings outside the target box so that available ambient cues can be used. This template generation strategy is more effective and rational than the previous fixed-area one, especially when the object has only a small number of points. Moreover, LiDAR point clouds in 3-D scenes are often incomplete and vary significantly from one frame to another, which makes the learning process more difficult. To address this, a novel cross-frame aggregation (CFA) module is proposed to enhance the feature representation of the template by aggregating features from a historical reference frame. Together, these schemes enable CAT to achieve robust performance, even in the case of extremely sparse point clouds. The experiments confirm that the proposed CAT outperforms state-of-the-art methods on both the KITTI and nuScenes benchmarks, achieving 3.9% and 5.6% improvements in precision, respectively.
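To make the idea of an adaptively sized template concrete, the following is a minimal sketch and not the authors' implementation. It assumes an axis-aligned target box and a simple rule that is not stated in the abstract: the crop region is enlarged step by step until it holds at least `min_points` points or reaches `max_scale`. The function names and all parameters (`min_points`, `step`, `max_scale`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def points_in_box(points: Sequence[Point], center: Sequence[float],
                  half_size: Sequence[float]) -> List[Point]:
    """Return the points lying inside an axis-aligned box."""
    return [
        p for p in points
        if all(abs(p[i] - center[i]) <= half_size[i] for i in range(3))
    ]


def adaptive_template(points: Sequence[Point], center: Sequence[float],
                      size: Sequence[float], min_points: int = 64,
                      step: float = 0.25,
                      max_scale: float = 3.0) -> Tuple[List[Point], float]:
    """Crop a template around the target box, enlarging the crop when sparse.

    Assumed rule (hypothetical): start from the target box itself (scale 1.0)
    and grow it by `step` until the crop holds at least `min_points` points
    or the scale reaches `max_scale`.
    """
    scale = 1.0
    while True:
        half = [0.5 * s * scale for s in size]
        template = points_in_box(points, center, half)
        if len(template) >= min_points or scale >= max_scale:
            return template, scale
        scale = min(scale + step, max_scale)


if __name__ == "__main__":
    # A sparse target: two points inside the box, two nearby, one far away.
    scene = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5),
             (1.2, 0.0, 0.0), (0.0, 1.4, 0.0), (5.0, 5.0, 5.0)]
    template, scale = adaptive_template(scene, (0.0, 0.0, 0.0),
                                        (2.0, 2.0, 2.0), min_points=4)
    print(len(template), scale)
```

With a dense target the loop exits at scale 1.0 and the template is the plain box crop; only sparse targets pull in surrounding points. A learned or data-dependent rule could replace the point-count threshold used here.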
More From: IEEE transactions on neural networks and learning systems