Abstract
Many real-world applications involve classification from evolving data streams. Learning in such environments requires algorithms that can learn and predict from potentially unbounded data that are constantly changing. To do so, stream algorithms must restrict storage to a part of the stream, and/or to synopsis information derived from it, using efficient and accurate strategies such as window models and summarization techniques (e.g., sampling, sketching, dimensionality reduction). In this work, we focus on k-Nearest Neighbors (kNN), where most existing approaches for data streams assume that instances have the same weight from the start to the end of the processing task. In a streaming scenario, however, the most recent elements of the stream are often the most relevant ones. We therefore propose a novel kNN approach that stores instances in a sliding window and weights them according to their arrival time (i.e., their position in the window) using an adjusted weight function. Empirical results on a comprehensive set of real and synthetic datasets indicate the effectiveness and efficiency of the proposed approach in comparison with state-of-the-art algorithms.
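To make the idea of a sliding window with arrival-time weights concrete, the following is a minimal Python sketch, not the method evaluated in this work. The abstract does not specify the adjusted weight function, so the exponential decay used here is purely an assumption, as are the class name `RecencyWeightedKNN`, the `decay` parameter, and the method names `learn_one` and `predict_one`.

```python
import math
from collections import defaultdict, deque


class RecencyWeightedKNN:
    """Sliding-window kNN whose votes are scaled by how recently each instance arrived."""

    def __init__(self, k=3, window_size=100, decay=0.01):
        self.k = k
        self.decay = decay  # hypothetical parameter, not taken from the paper
        # deque(maxlen=...) drops the oldest instance once the window is full
        self.window = deque(maxlen=window_size)

    def learn_one(self, x, y):
        """Store a labelled instance at the newest position in the window."""
        self.window.append((tuple(x), y))

    def _weight(self, age):
        # ASSUMED weight function: 1.0 for the newest instance (age 0),
        # decaying exponentially with age. The paper's function may differ.
        return math.exp(-self.decay * age)

    def predict_one(self, x):
        """Return the label with the largest recency-weighted vote among the k nearest."""
        if not self.window:
            return None
        newest = len(self.window) - 1
        # (Euclidean distance, age in window positions, label) per stored instance
        scored = [
            (math.dist(x, xi), newest - pos, yi)
            for pos, (xi, yi) in enumerate(self.window)
        ]
        neighbours = sorted(scored, key=lambda t: t[0])[: self.k]
        votes = defaultdict(float)
        for _, age, label in neighbours:
            votes[label] += self._weight(age)
        return max(votes, key=votes.get)
```

In a test-then-train loop, each arriving instance would first be passed to `predict_one` and then to `learn_one`. Setting `decay=0.0` in this sketch gives every neighbour the same weight, which recovers a plain majority-vote kNN over the window.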