Abstract

Feature selection on incomplete datasets is a challenging task. To address this challenge, existing methods first employ imputation methods to complete the dataset and then perform feature selection based on the imputed dataset. Since missing value imputation and feature selection are entirely independent, the importance of features cannot be considered during imputation. However, in real-world scenarios or datasets, different features have varying degrees of importance. To this end, we proposed a novel incomplete data feature selection framework that considers feature importance. The framework mainly consists of two alternating iterative stages: M-stage and W-stage. In the M-stage, missing values are imputed based on a given feature importance vector and multiple initial imputation results. In the W-stage, an improved reliefF algorithm is employed to learn the feature importance vector based on the imputed data. In particular, the feature importance output by the W-stage in the current iteration will be used as the input of the M-stage in the next iteration. Experimental results on artificial and real missing datasets demonstrate that the proposed method outperforms other approaches significantly.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.