Abstract

In this paper, we propose a new online learning algorithm tailored for data streams described by varying feature spaces (VFS), wherein new features constantly emerge and old features may stop to be observed over various time spans. Our proposed algorithm, named Online Random Feature Forests for Feature space Variabilities (ORF3V), provides a strategy to respect such feature dynamics by generating, updating, pruning, as well as online re-weighing an ensemble of what we call feature forests, which are generated and updated based on a compressed and storage efficient representation for each observed feature. We benchmark our algorithm on 12 datasets, including one novel real-world dataset of government COVID-19 responses collected through a crowd-sensing program in Spain. The empirical results substantiate the viability and effectiveness of our ORF3V algorithm and its superior accuracy performance over the state-of-the-art rival models.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.