Abstract

Online streaming feature selection has received extensive attention in the past few years. Existing approaches share a common assumption: the feature space of a fixed set of data instances grows dynamically without any missing entries. This assumption, however, does not hold in many real-world applications. For example, a credit evaluation system cannot collect the complete set of dynamic features for every person or enterprise. Motivated by this observation, this paper addresses online feature selection from capricious streaming features, where features flow in one by one with some randomly missing entries while the number of data instances remains fixed. To this end, we propose a general framework named GF-CSF. The main idea of GF-CSF is to use latent factor analysis to preprocess capricious streaming features, completing their missing entries before feature selection is conducted. Both theoretical and experimental analyses indicate that GF-CSF can efficiently improve any existing online streaming feature selection model so that it achieves online capricious streaming feature selection.
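
The completion step described above can be illustrated with a minimal sketch. It is not the GF-CSF implementation: it assumes missing entries are marked as NaN, that the features seen so far are buffered in one instances-by-features matrix, and that a plain regularized low-rank factorization fitted by gradient descent stands in for the paper's latent factor analysis. The function name `complete_with_latent_factors` and all of its parameters are hypothetical.

```python
import numpy as np


def complete_with_latent_factors(X, rank=2, lr=0.01, reg=0.01, epochs=500, seed=0):
    """Fill NaN entries of X (instances x features) with a low-rank estimate.

    Illustrative stand-in for latent factor analysis; not the paper's algorithm.
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)              # True where an entry was actually collected
    X0 = np.where(observed, X, 0.0)      # zero out missing entries for the arithmetic
    n, m = X.shape

    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n, rank))   # latent factors of the instances
    Q = 0.1 * rng.standard_normal((m, rank))   # latent factors of the features

    for _ in range(epochs):
        # Residual is computed on observed entries only; missing ones contribute 0.
        err = np.where(observed, X0 - P @ Q.T, 0.0)
        # Gradient step on squared error with L2 regularization.
        grad_P = err @ Q - reg * P
        grad_Q = err.T @ P - reg * Q
        P += lr * grad_P
        Q += lr * grad_Q

    # Keep observed values as they are; use the estimate only for missing entries.
    return np.where(observed, X, P @ Q.T)
```

Under these assumptions, the most recently arrived column of the completed matrix could then be passed to any existing online streaming feature selection method, which is the role the abstract assigns to the preprocessing step.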
