Hypergraph-Based Multi-Modal Representation for Open-Set 3D Object Retrieval.

Yifan Feng, Yue Gao, Yu-Shen Liu, Shuyi Ji, Qionghai Dai, Shaoyi Du

doi:10.1109/tpami.2023.3332768

Abstract

The traditional 3D object retrieval (3DOR) task uses the closed-set setting, which assumes that every object category seen at retrieval time was also seen during training. Methods designed for this setting may learn only to discriminate the known categories rather than a generalized 3D object embedding. Retrieval therefore remains a challenging open problem in real-world applications, where many unseen categories exist. In this paper, we first introduce the open-set 3DOR task to expand the applications of the traditional 3DOR task. Then, we propose the Hypergraph-Based Multi-Modal Representation (HGM²R) framework to learn 3D object embeddings from multi-modal representations under the open-set setting. The proposed framework is composed of two modules: the Multi-Modal 3D Object Embedding (MM3DOE) module and the Structure-Aware and Invariant Knowledge Learning (SAIKL) module. By using the collaborative information of modalities derived from the same 3D object, the MM3DOE module overcomes the differences between modality representations and generates unified 3D object embeddings. The SAIKL module then uses a constructed hypergraph structure to model the high-order correlation among 3D objects from both seen and unseen categories. The SAIKL module also includes a memory bank that stores typical representations of 3D objects. By aligning with the memory anchors in this bank, the embeddings integrate invariant knowledge and generalize better to unseen categories. We formally prove that hypergraph modeling has better representative capability on data correlation than graph modeling. We generate four multi-modal datasets for the open-set 3DOR task, i.e., OS-ESB-core, OS-NTU-core, OS-MN40-core, and OS-ABO-core, in which each 3D object has three modality representations: multi-view images, point clouds, and voxels. Experiments on these four datasets show that the proposed method significantly outperforms existing methods. In particular, it outperforms the state of the art by 12.12% and 12.88% in mAP on the OS-MN40-core and OS-ABO-core datasets, respectively. Results and visualizations demonstrate that the proposed method effectively extracts generalized 3D object embeddings on the open-set 3DOR task and achieves satisfactory performance.
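The abstract does not say how the hypergraph is constructed or propagated, so the following is only a minimal sketch of one common approach, not the authors' implementation. It assumes that each object forms one hyperedge with its k nearest neighbours in embedding space, and that embeddings are smoothed with the normalized propagation D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2} X using unit hyperedge weights. The function names `knn_hypergraph` and `hypergraph_smooth`, the value of `k`, and the random embeddings are hypothetical choices made for illustration.

```python
import numpy as np

def knn_hypergraph(X, k=3):
    """Build an incidence matrix H of shape (n_objects, n_hyperedges).

    Assumption: one hyperedge per object, joining that object with its
    k nearest neighbours (Euclidean distance) in embedding space.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all embeddings.
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    H = np.zeros((n, n))
    for e in range(n):
        # The k + 1 closest objects include object e itself (distance 0).
        members = np.argsort(d[e])[: k + 1]
        H[members, e] = 1.0
    return H

def hypergraph_smooth(X, H):
    """One propagation step with unit hyperedge weights:
    Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X.
    """
    dv = H.sum(axis=1)  # vertex degree: number of hyperedges containing each object
    de = H.sum(axis=0)  # hyperedge degree: number of objects in each hyperedge
    dv_inv_sqrt = 1.0 / np.sqrt(dv)
    G = (dv_inv_sqrt[:, None] * H / de[None, :]) @ (H.T * dv_inv_sqrt[None, :])
    return G @ X

# Hypothetical usage: 8 objects with 4-dimensional unified embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
H = knn_hypergraph(X, k=3)
Z = hypergraph_smooth(X, H)
```

Because one hyperedge can join more than two objects, a single column of `H` records a group relation that an ordinary graph would have to split into several pairwise edges. This is the kind of high-order correlation the abstract refers to.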

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication Date: Apr 1, 2024
Citations: 3

Similar Papers

SHREC’22 track: Open-Set 3D Object Retrieval
Yifan Feng ...
Computers & Graphics | VOL. 107
27 Jul 2022

CNN-based 3D object classification using Hough space of LiDAR point clouds
Wei Song ... Amanda Gozho
Human-centric Computing and Information Sciences | VOL. 10
07 May 2020

3D Object Recognition Method Based on Point Cloud Sequential Coding
Shuai Dong ... Kun Zou
15 Feb 2020

Multi-View 3D Object Retrieval using Autoencoder & Deep Embedding Network
Sakifa Aktar ... Md Al Mamun
01 Feb 2019
