Abstract

A variety of real-world applications rely heavily on an adequate analysis of transient data streams. Because data streams impose rigid processing requirements, common analysis techniques known from data mining are not directly applicable. Density estimation is a fundamental building block of many data mining and analysis approaches. It provides a well-defined estimate of a continuous data distribution, which makes its adaptation to data streams desirable. A convenient method for density estimation uses kernels. The computational complexity of kernel density estimation, however, renders its direct application to data streams impossible. In this paper, we tackle this problem and propose our Cluster Kernel approach, which provides continuously computed kernel density estimators over streaming data. Cluster Kernels not only meet the rigid processing requirements of data streams, they also allocate only a constant amount of memory, and that amount can be adapted dynamically to changing system resources. For this purpose, we develop an intelligent merge scheme for Cluster Kernels and use continuously collected local statistics to resample already processed data. We focus on Cluster Kernels for one-dimensional data streams, but also address the multi-dimensional case. We validate the efficacy of Cluster Kernels on a variety of real-world data streams in an extensive experimental study.
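To make the memory argument concrete, the following minimal Python sketch contrasts a plain weighted Gaussian kernel density estimate with a summary that keeps at most a fixed number of kernels over a one-dimensional stream. It is an illustration under stated assumptions, not the Cluster Kernel method itself: the class name `BoundedKernelSummary`, the fixed bandwidth `h`, and the rule of merging the two closest neighbouring centers into their weighted mean are hypothetical stand-ins for the merge scheme and local statistics the paper develops.

```python
import bisect
import math


def gaussian_kde(centers, weights, h, x):
    """Evaluate a weighted Gaussian kernel density estimate at x.

    The cost is linear in the number of kernels, which is why keeping
    one kernel per stream element does not scale to unbounded streams.
    """
    total = sum(weights)
    norm = total * h * math.sqrt(2.0 * math.pi)
    acc = 0.0
    for c, w in zip(centers, weights):
        acc += w * math.exp(-0.5 * ((x - c) / h) ** 2)
    return acc / norm


class BoundedKernelSummary:
    """Keeps at most `budget` weighted kernel centers for a 1-D stream.

    Assumption for illustration only: when the budget is exceeded, the
    two closest neighbouring centers are replaced by their weighted mean.
    This is NOT the merge scheme proposed in the paper.
    """

    def __init__(self, budget):
        if budget < 1:
            raise ValueError("budget must be at least 1")
        self.budget = budget
        self.centers = []  # kept sorted in ascending order
        self.weights = []  # number of stream elements each center represents

    def insert(self, x):
        # Insert the new element as its own kernel, preserving sort order.
        i = bisect.bisect_left(self.centers, x)
        self.centers.insert(i, x)
        self.weights.insert(i, 1.0)
        if len(self.centers) > self.budget:
            self._merge_closest()

    def _merge_closest(self):
        # Find the adjacent pair of centers with the smallest gap.
        j = min(range(len(self.centers) - 1),
                key=lambda k: self.centers[k + 1] - self.centers[k])
        w = self.weights[j] + self.weights[j + 1]
        c = (self.weights[j] * self.centers[j]
             + self.weights[j + 1] * self.centers[j + 1]) / w
        # Replace the pair by a single kernel carrying the combined weight.
        self.centers[j:j + 2] = [c]
        self.weights[j:j + 2] = [w]

    def density(self, x, h):
        return gaussian_kde(self.centers, self.weights, h, x)


summary = BoundedKernelSummary(budget=3)
for value in [0.0, 1.0, 1.1, 5.0]:
    summary.insert(value)
estimate_at_one = summary.density(1.0, h=0.5)
```

In this sketch the memory footprint is bounded by `budget` regardless of stream length, and the total weight still equals the number of processed elements. Merging discards information about the merged elements, which is the loss that the paper's local statistics and resampling are meant to address.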
