Abstract

This paper addresses the problem of learning from unbalanced datasets, on which traditional algorithms may perform poorly because classifiers tend to favor the larger, often less important, classes. To handle the imbalance, we synthesize data by training variational autoencoders (VAEs) on the raw training samples, and then train different models on different input sources: raw data alone, and a combination of raw and synthetic data. We evaluate our method using multiple criteria on the SVHN dataset, which consists of complex images, and carry out a comprehensive comparative analysis of popular CNN architectures on balanced and unbalanced data to determine which performs best under class imbalance. We find that data synthesis via VAE is reliable and robust, and that models trained with the synthesized data classify real data with higher accuracy than models trained on the original unbalanced data. Our results demonstrate the strength of VAEs for solving the class imbalance problem.
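
The abstract does not state how many synthetic samples are generated for each class. As a minimal sketch, assuming the goal is to bring every class up to the size of the largest one, the per-class shortfall could be computed as below. The function name `synthetic_counts` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def synthetic_counts(labels):
    """Return, per class, how many extra samples are needed to match the largest class.

    Assumption (not from the paper): balancing means topping every class up
    to the size of the majority class.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Toy labels: class 0 is the majority, classes 1 and 2 are under-represented.
labels = [0] * 5 + [1] * 2 + [2] * 1
print(synthetic_counts(labels))  # {0: 0, 1: 3, 2: 4}
```

In a pipeline like the one described, these shortfalls would be the number of samples to draw from a generative model trained on each minority class before combining raw and synthetic data.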
