Abstract

3D shape retrieval is an important research topic in the field of modern multimedia information retrieval. Point clouds and meshes are commonly used representations of 3D data and have strong shape description capabilities. However, existing multimodal 3D shape retrieval methods lack fusion learning across these two irregular data types. In this paper, we design a depthwise separable hypergraph convolution and build a multimodal fusion network based on it, which uses hypergraphs to model higher-order relationships among data and improves 3D shape retrieval capability through the effective fusion of point cloud and mesh data. First, initial feature descriptors of the point cloud and mesh modalities are extracted using pretrained networks. Next, a channel shuffle is performed on the initial feature descriptors to mix the multimodal data, and the k-Nearest Neighbour (kNN) algorithm is then used to construct the corresponding hypergraphs. Finally, the proposed depthwise separable hypergraph convolution is applied to extract discriminative shape representations and fuse the multimodal information. During network training, the fusion network is jointly constrained by a mean squared error loss and a cross-entropy loss. The proposed network is applied to the 3D shape retrieval task, and the experimental results demonstrate that the proposed method greatly improves retrieval accuracy.
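To make the pipeline concrete, the following is a minimal PyTorch sketch of the three steps the abstract describes: channel shuffle of the concatenated modality descriptors, kNN hypergraph construction, and a depthwise separable hypergraph convolution. All names are hypothetical; the layer factorises the standard HGNN propagation rule into a per-channel (depthwise) aggregation and a pointwise channel-mixing step, which is an assumption about the paper's design, not its published implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def channel_shuffle(x, groups=2):
    """ShuffleNet-style channel shuffle over concatenated point-cloud and
    mesh descriptors, so later layers see interleaved modality channels."""
    n, c = x.shape
    return x.view(n, groups, c // groups).transpose(1, 2).reshape(n, c)


def knn_hypergraph(x, k=10):
    """Build a vertex-hyperedge incidence matrix H of shape (N, N): each
    vertex j spawns one hyperedge connecting j and its k nearest
    neighbours in feature space. Construction details are assumed."""
    dist = torch.cdist(x, x)                       # (N, N) pairwise distances
    _, idx = dist.topk(k + 1, largest=False)       # each row: self + k neighbours
    H = torch.zeros(x.size(0), x.size(0), device=x.device)
    H.scatter_(1, idx, 1.0)                        # row j marks members of hyperedge j
    return H.t()                                   # rows: vertices, cols: hyperedges


class DepthwiseSeparableHGConv(nn.Module):
    """Sketch of a depthwise separable hypergraph convolution, assuming the
    standard HGNN rule X' = Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta is
    split into per-channel aggregation (depthwise) followed by a 1x1
    channel-mixing step (pointwise). Hyperedge weights are identity here."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Parameter(torch.ones(in_ch))  # per-channel weight
        self.pointwise = nn.Linear(in_ch, out_ch)         # channel mixing

    def forward(self, X, H):
        dv = H.sum(dim=1).clamp(min=1).pow(-0.5)          # vertex degree^-1/2
        de = H.sum(dim=0).clamp(min=1)                    # hyperedge degrees
        Xa = dv.unsqueeze(1) * X                          # normalise vertices
        Xa = (H.t() @ Xa) / de.unsqueeze(1)               # average within hyperedges
        Xa = dv.unsqueeze(1) * (H @ Xa)                   # scatter back, renormalise
        Xa = Xa * self.depthwise                          # depthwise step
        return F.relu(self.pointwise(Xa))                 # pointwise step


# Hypothetical usage, one row per 3D shape:
# x = channel_shuffle(torch.cat([pc_feat, mesh_feat], dim=1), groups=2)
# H = knn_hypergraph(x, k=10)
# logits = DepthwiseSeparableHGConv(x.size(1), num_classes)(x, H)
# loss = F.cross_entropy(logits, labels) + mse_term  # the abstract does not
#                                                    # specify the MSE's operands
```

The joint objective mirrors the abstract's description of a cross-entropy loss plus a mean squared error constraint; which tensors the MSE compares is not stated in the abstract, so it is left abstract here.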
