Abstract
Class imbalance is a widespread problem in data sets, and it strongly affects classification results. Traditional preprocessing methods for imbalanced data sets fall mainly into undersampling and oversampling. Oversampling is prone to overfitting and blurred class boundaries, while undersampling discards useful information in the samples. This paper proposes a deep learning oversampling model to address these problems. The model uses a data generation algorithm, the variational autoencoder (VAE), to learn the features of the minority-class samples in the imbalanced data set and generate new samples, and then combines the newly generated samples with the original data set to form a new data set. Experimental results show that classification accuracy on the new data set is higher than with conventional oversampling or undersampling methods, indicating that the variational autoencoder, as a generative model, gives better preprocessing results for imbalanced data sets.
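The final step described above, merging generated minority-class samples with the original data, can be sketched independently of the generator. The following is a minimal illustration only, not the paper's implementation: `rebalance` and `gaussian_stand_in` are hypothetical names, a two-class data set is assumed, and the Gaussian sampler is a placeholder for sampling from a trained VAE decoder, not a VAE itself.

```python
import random
from collections import Counter


def gaussian_stand_in(minority, n_new, rng):
    """Placeholder generator (NOT a VAE): draws each feature independently
    from a Gaussian fitted to the minority class. A trained VAE decoder
    would take the place of this function."""
    dims = list(zip(*minority))
    means = [sum(d) / len(d) for d in dims]
    stds = [
        (sum((v - m) ** 2 for v in d) / len(d)) ** 0.5
        for d, m in zip(dims, means)
    ]
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n_new)]


def rebalance(X, y, generate, seed=0):
    """Append generated minority samples until both classes are the same size.
    Assumes a two-class problem; X is a list of feature lists, y a list of labels."""
    rng = random.Random(seed)
    counts = Counter(y)
    minority_label, n_min = min(counts.items(), key=lambda kv: kv[1])
    n_maj = max(counts.values())
    minority = [x for x, label in zip(X, y) if label == minority_label]
    synthetic = generate(minority, n_maj - n_min, rng)
    # The original samples are kept unchanged; synthetic ones are appended.
    return X + synthetic, y + [minority_label] * len(synthetic)


# Toy data: four majority samples (label 0), two minority samples (label 1).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [5.0, 5.1], [5.2, 4.9]]
y = [0, 0, 0, 0, 1, 1]
X_bal, y_bal = rebalance(X, y, gaussian_stand_in)
print(Counter(y_bal))
```

Passing the generator as a callable keeps the merging logic separate from the generative model, so a VAE-based sampler could be swapped in without changing `rebalance`.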