Abstract

3D CPS (Cyber Physical System) data are widely generated and utilized in many applications, e.g., autonomous driving and unmanned aerial vehicles. For large-scale 3D CPS data analysis, 3D object retrieval plays a significant role in urban perception. In this paper, we propose an end-to-end domain adaptation framework for cross-domain 3D object retrieval (C3DOR-Net), which learns a joint embedding space for 3D objects from different domains in an end-to-end manner. We focus on the unsupervised case, in which 3D objects in the target domain are unlabeled. To better encode a 3D object, the proposed method learns multi-view visual features in a data-driven manner for 3D object representation. A domain adaptation strategy is then applied to benefit both domain alignment and final classification. In particular, a center-based discriminative feature learning method yields domain-invariant features with better intra-class compactness and inter-class separability: C3DOR-Net achieves remarkable retrieval performance by maximizing the inter-class divergence and minimizing the intra-class divergence. We evaluate our method on two cross-domain protocols: 1) CAD-to-CAD object retrieval on two popular 3D datasets (NTU and PSB) under three designed cross-domain scenarios; and 2) SHREC'19 monocular image based 3D object retrieval. Experimental results demonstrate that our method significantly boosts cross-domain retrieval performance.
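The abstract does not spell out the center-based discriminative objective, but one common formulation pulls each feature toward its class center (intra-class compactness) while pushing distinct centers at least a margin apart (inter-class separability). The sketch below illustrates that idea; the function name, the `margin` parameter, and the exact hinge form are illustrative assumptions, not the paper's actual loss.

```python
import numpy as np

def center_discriminative_loss(features, labels, centers, margin=1.0):
    """Illustrative center-based discriminative objective (not the
    paper's exact formulation):
      - intra term: mean squared distance of each feature to its
        own class center (minimizing intra-class divergence);
      - inter term: hinge penalty when two class centers are closer
        than `margin` (maximizing inter-class divergence).
    """
    # Intra-class compactness: distance to the assigned class center.
    intra = np.mean(np.sum((features - centers[labels]) ** 2, axis=1))

    # Inter-class separability: penalize center pairs closer than margin.
    inter = 0.0
    n = len(centers)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(centers[i] - centers[j])
            inter += max(0.0, margin - d) ** 2
    return intra + inter
```

When every feature coincides with its class center and the centers are farther apart than the margin, the loss is zero; any intra-class spread or center crowding raises it.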

Highlights

  • Ubiquitous sensing technologies have produced a variety of CPS (Cyber Physical System) data

  • Different from prior works, our method focuses on cross-domain 3D object retrieval, which aims to search for relevant candidates in the target dataset given a query 3D object from the source dataset

  • First tier (FT): the recall for the first K relevant samples, where K is the cardinality of the target category
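The First Tier metric in the highlight above can be computed directly from a ranked list of retrieved labels. A minimal sketch, assuming the retrieval system returns labels sorted by similarity (function name is hypothetical):

```python
def first_tier(ranked_labels, query_label):
    """First Tier (FT): recall within the top-K retrieved items,
    where K is the cardinality of the query's category in the
    target set (i.e., the number of relevant items)."""
    # K = number of items sharing the query's category.
    k = sum(1 for lbl in ranked_labels if lbl == query_label)
    if k == 0:
        return 0.0
    # Recall restricted to the first K positions of the ranking.
    return sum(1 for lbl in ranked_labels[:k] if lbl == query_label) / k
```

A perfect ranking places all K relevant items first and scores 1.0; intruders in the top K lower the score proportionally.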

Introduction

Ubiquitous sensing technologies have produced a variety of CPS (Cyber Physical System) data. Among multi-modal CPS data, 3D data plays a significant role in urban perception, and advanced 3D object retrieval methods are critical for CPS data analysis. Publicly available 3D object datasets (e.g., 3D Warehouse and Thingiverse) support a wide range of applications through convenient downloading and searching. Confronted with the huge number of 3D models and the multiple modalities of 3D data (e.g., RGB-D objects and CAD shapes), cross-domain 3D object retrieval is becoming indispensable [1]–[3]. Most prior works focus on 3D object classification and retrieval within a single dataset. The mainstream methods for 3D object classification and retrieval fall into two categories: model-based and view-based. Model-based methods directly implement 3D object data as
