Abstract
View-based methods have achieved state-of-the-art performance in 3D object retrieval. However, view-based methods still face two major challenges. The first is how to leverage inter-view correlation to enhance view-level visual features. The second is how to effectively fuse view-level features into a discriminative global descriptor. To address these two challenges, we propose a multi-range view aggregation network (MRVANet) with a Vision Transformer based feature fusion scheme for 3D object retrieval. Unlike existing methods, which aggregate only neighboring or adjacent views and can thus introduce redundant information, our multi-range view aggregation module enhances individual view representations by aggregating views at multiple ranges rather than neighboring views alone. Furthermore, to generate the global descriptor from view-level features, we employ the multi-head self-attention mechanism introduced by the Vision Transformer to fuse the view-level features. Extensive experiments conducted on three public datasets, ModelNet40, ShapeNet Core55 and MCB-A, demonstrate the superiority of the proposed network over state-of-the-art methods in 3D object retrieval.
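To make the fusion step concrete, the sketch below illustrates one plausible way to fuse per-view features into a global descriptor with multi-head self-attention, in the spirit of the Vision Transformer based scheme the abstract describes. This is not the authors' implementation; the module name `ViewFusion`, the feature dimension, the head count, and the use of a learnable class token are all illustrative assumptions.

```python
# Hedged sketch (not the paper's code): fusing view-level features into a
# single global descriptor via multi-head self-attention, ViT-style.
import torch
import torch.nn as nn

class ViewFusion(nn.Module):  # hypothetical name
    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        # Learnable token that attends over all view features,
        # analogous to the [class] token in a Vision Transformer.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, view_feats: torch.Tensor) -> torch.Tensor:
        # view_feats: (batch, num_views, dim) per-view visual features.
        b = view_feats.size(0)
        tokens = torch.cat([self.cls_token.expand(b, -1, -1), view_feats], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)  # self-attention across views
        # The class-token output serves as the global shape descriptor.
        return self.norm(fused[:, 0])

# Usage: 12 views, each encoded to a 512-d feature by a CNN backbone.
feats = torch.randn(2, 12, 512)
descriptor = ViewFusion()(feats)  # (2, 512) global descriptor for retrieval
```

In this sketch the class-token output is taken as the global descriptor, mirroring how a ViT summarizes a token sequence; the paper's actual fusion head may differ in detail.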