Abstract

Category-level 6D object pose estimation aims to predict the rotation, translation, and size of object instances in arbitrary scenes. Current methods usually rely on global average pooling to aggregate geometric features, which captures only the first-order statistics of the features and does not fully exploit the potential of the network. In this work, we propose a high-order pose estimation network (HoPENet) that enhances feature representation by collecting high-order statistics to model high-order geometric features at each stage of the network. HoPENet introduces a global high-order enhancement module that uses global high-order pooling to capture correlations between features and fuse global information. The module also captures long-term statistical correlations and makes full use of contextual information, so the network as a whole obtains a more discriminative feature representation. Experiments on two benchmarks, the virtual dataset CAMERA25 and the real dataset REAL275, demonstrate the effectiveness of HoPENet, which achieves state-of-the-art (SOTA) pose estimation performance.
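
To make the contrast between first-order and high-order pooling concrete, the following is a minimal NumPy sketch. It is not the HoPENet module: it assumes per-point features of shape (N, C) and uses plain channel covariance as one common form of second-order pooling. The function names and toy data are hypothetical, chosen only for illustration.

```python
import numpy as np


def global_average_pool(features):
    """First-order pooling: per-channel mean over all N points, (N, C) -> (C,)."""
    return features.mean(axis=0)


def global_covariance_pool(features):
    """Second-order pooling (illustrative assumption): channel covariance, (N, C) -> (C, C)."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / features.shape[0]


# Two toy feature sets with identical channel means but opposite channel correlation.
a = np.array([[1.0, 1.0], [-1.0, -1.0]])
b = np.array([[1.0, -1.0], [-1.0, 1.0]])

# Average pooling cannot tell them apart: both means are the zero vector.
print(global_average_pool(a), global_average_pool(b))

# Covariance pooling separates them through the off-diagonal (cross-channel) terms.
print(global_covariance_pool(a))
print(global_covariance_pool(b))
```

In this toy case both inputs pool to the same first-order descriptor, while the second-order descriptors differ in the sign of the cross-channel terms. That is the kind of correlation between features that average pooling discards.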
