Abstract

Detecting concept drift makes it possible to identify when a data stream changes its behavior over time, which supports further analysis of why the phenomenon represented by the data has changed. Researchers have recently approached concept drift with unsupervised learning strategies, because data streams are open-ended sequences of data that are extremely hard to label. These approaches usually compute divergences between consecutive models obtained over time. However, they tend to be imprecise because the models are produced by clustering algorithms that do not hold any stability property. A clustering algorithm that holds a stability property guarantees that a change in the clustering model corresponds to an actual change in the input data. This drawback motivated this work, which proposes a new approach to modeling data streams using a stable hierarchical clustering algorithm. Our approach also considers data streams composed of a mixture of time-dependent and time-independent observations. Experiments were conducted on synthetic data streams exhibiting different behaviors. The results confirm that the new approach is capable of detecting concept drift in data streams.


