Abstract

The current state of 3D zero-shot recognition falls short of its counterpart in 2D images. A major challenge is the absence of a robust feature extractor for 3D point cloud data. Overcoming it requires collecting and processing a large dataset and feeding it into a deep-learning model capable of producing distinguishable features after training. To this end, we propose VAE-GAN3D, a model that combines a variational autoencoder (VAE) and a generative adversarial network (GAN) to supplement the small amount of training data with synthetic class examples. The deep VAE network thereby learns the underlying distribution of large and complex data, and, when combined with the GAN, it generates synthetic features that are consistent across both seen and unseen classes. Furthermore, we observe that for tasks involving a small domain of classes, existing text features contribute little to zero-shot learning. We therefore introduce image representation-based semantic features of classes, which improve zero-shot recognition performance for 3D objects. We assess the proposed model on three different datasets and present a technique for partitioning the RGB-D object dataset, which contains real-world objects, into seen and unseen classes. Our approach shows promising results on the challenges of 3D zero-shot recognition and offers a novel solution for improving the accuracy of 3D point cloud recognition.
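
To make the approach more concrete, the sketch below illustrates one way such a conditional VAE-GAN feature synthesizer could be structured, assuming PyTorch. The layer widths, the feature and semantic-embedding dimensionalities, and the equal weighting of the loss terms are illustrative assumptions, not details taken from the paper.

    # Minimal sketch of a conditional VAE-GAN feature synthesizer (assumed PyTorch).
    # Dimensions and loss weighting are illustrative assumptions, not the paper's design.
    import torch
    import torch.nn as nn

    FEAT_DIM = 1024   # dimensionality of extracted 3D point-cloud features (assumed)
    SEM_DIM = 512     # dimensionality of image-based semantic class embeddings (assumed)
    LATENT_DIM = 64   # size of the VAE latent space (assumed)

    class Encoder(nn.Module):
        """Maps a real feature plus its class embedding to a latent Gaussian."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(FEAT_DIM + SEM_DIM, 512), nn.ReLU())
            self.mu = nn.Linear(512, LATENT_DIM)
            self.logvar = nn.Linear(512, LATENT_DIM)

        def forward(self, feat, sem):
            h = self.net(torch.cat([feat, sem], dim=1))
            return self.mu(h), self.logvar(h)

    class Generator(nn.Module):
        """Decodes a latent code plus a class embedding into a synthetic feature."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(LATENT_DIM + SEM_DIM, 512), nn.ReLU(),
                nn.Linear(512, FEAT_DIM))

        def forward(self, z, sem):
            return self.net(torch.cat([z, sem], dim=1))

    class Discriminator(nn.Module):
        """Scores whether a (feature, class embedding) pair looks real."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(FEAT_DIM + SEM_DIM, 512), nn.ReLU(),
                nn.Linear(512, 1))

        def forward(self, feat, sem):
            return self.net(torch.cat([feat, sem], dim=1))

    def vae_gan_generator_loss(enc, gen, disc, feat, sem):
        """Combined VAE + adversarial loss on a batch of seen-class features."""
        mu, logvar = enc(feat, sem)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)          # reparameterization trick
        fake = gen(z, sem)

        recon = nn.functional.mse_loss(fake, feat)    # reconstruction term
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        adv = nn.functional.binary_cross_entropy_with_logits(
            disc(fake, sem), torch.ones(feat.size(0), 1))  # try to fool the discriminator
        return recon + kld + adv                      # equal weighting is an assumption

After training on seen-class features, synthetic features for an unseen class would be obtained by sampling z from a standard normal distribution and conditioning the generator on that class's image-based semantic embedding; a conventional classifier can then be trained on the synthesized features. A full training loop would also update the discriminator on real versus generated features, which is omitted here for brevity.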
