Abstract

Missing data is a pervasive issue that can degrade modeling performance and make the desired tasks harder to complete. Many approaches have been developed for missing data imputation. Recently, an effective imputation approach termed generative adversarial imputation nets (GAIN) was developed by taking advantage of the emerging generative adversarial network (GAN). However, its modeling architecture may still lead to significant imputation bias. In addition, because of the GAN structure, the training process of GAIN may be unstable and the imputation variation may be high. To address these two limitations, the ensemble GAIN with selective multi-generator (ESM-GAIN) is proposed to improve imputation accuracy and robustness. The contributions of the proposed ESM-GAIN are twofold: (1) a selective multi-generation framework is proposed to identify high-quality imputations; (2) an ensemble learning framework is incorporated into GAIN imputation to improve imputation robustness. The effectiveness of the proposed ESM-GAIN is validated on both numerical simulations and two real-world breast cancer datasets.
