Abstract
Class imbalance and data incompleteness occur together in many real-world classification datasets, and both degrade classifier training. Given an imbalanced and incomplete training dataset, the conventional approach addresses the two problems sequentially: it handles data incompleteness first and then class imbalance. In this study, we propose a multiple imputation-based minority oversampling technique, named MI-MOTE, that addresses both problems simultaneously. Majority instances are imputed once, whereas minority instances are oversampled by imputing their missing values multiple times with different imputations, without directly manipulating any of their observed values. As a result, minority instances are diversified with less data distortion than under the conventional approach. Because the proposed method is applied in the data preprocessing phase, it can be used with any type of classifier. Experimental results on benchmark datasets with various missing rates demonstrate the effectiveness of the proposed method.
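The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the imputation model, so the sketch substitutes a class-wise mean for the single imputation of majority instances and a random draw from observed same-class values (a simple hot-deck scheme) for the multiple imputations of minority instances. The function name `mi_mote_sketch`, its parameters, the use of `None` to mark missing values, and the choice to keep complete minority instances unchanged are all assumptions made for illustration.

```python
import random
from statistics import mean


def mi_mote_sketch(X, y, minority_label, n_imputations=3, seed=0):
    """Illustrative sketch only (hypothetical helper, not the paper's code).

    X is a list of rows; None marks a missing value.
    Returns a completed, oversampled (X_new, y_new).
    Assumes every (class, feature) pair has at least one observed value.
    """
    rng = random.Random(seed)

    # Collect observed values per (class label, feature index).
    pools = {}
    for row, label in zip(X, y):
        for j, value in enumerate(row):
            if value is not None:
                pools.setdefault((label, j), []).append(value)

    X_new, y_new = [], []
    for row, label in zip(X, y):
        missing = [j for j, value in enumerate(row) if value is None]

        if label != minority_label:
            # Majority instance: impute once (class-wise mean as a stand-in).
            filled = list(row)
            for j in missing:
                filled[j] = mean(pools[(label, j)])
            X_new.append(filled)
            y_new.append(label)
        else:
            # Minority instance: one copy per imputation. Observed values
            # are copied unchanged; only missing entries are filled in.
            # Assumption: complete minority rows are kept once, as-is.
            copies = n_imputations if missing else 1
            for _ in range(copies):
                filled = list(row)
                for j in missing:
                    filled[j] = rng.choice(pools[(label, j)])
                X_new.append(filled)
                y_new.append(label)

    return X_new, y_new
```

Because the output is an ordinary complete dataset, it can be passed to any classifier's training routine, which matches the preprocessing-phase claim in the abstract. In practice a model-based multiple imputation method would likely replace the random draw used here.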