Abstract

The goal of this thesis is to study and implement a data stream clustering algorithm that combines fast running speed with high clustering accuracy. To this end, we carried out the following work. We discuss the background of, and related work on, data stream mining. We summarize popular traditional clustering algorithms and survey existing data stream clustering algorithms. Building on these, we propose the GD-Stream (Grid-Density based Evolving Stream) algorithm, a framework based on grid density. By modifying the synopsis data structure, the algorithm gains the following characteristics. Borrowing the framework of the CluStream algorithm, GD-Stream is divided into an online layer and an offline layer and uses a density-decay technique. The online layer reads the data stream rapidly and stores the relevant information in the synopsis data structure. From this synopsis, the offline layer produces an accurate clustering. The two layers work together to balance accuracy and speed.
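The online/offline split described above can be illustrated with a minimal sketch. This is not the GD-Stream implementation, which the abstract does not specify. It is a generic grid-density scheme under stated assumptions: the class name GridSynopsis, the function offline_clusters, the exponential decay form, and the parameters cell_size, decay, and min_density are all hypothetical choices made for illustration.

```python
import math
from collections import deque


class GridSynopsis:
    """Hypothetical online-layer synopsis: one decayed density per grid cell."""

    def __init__(self, cell_size=1.0, decay=0.998):
        self.cell_size = cell_size  # assumed side length of a grid cell
        self.decay = decay          # assumed per-time-step decay factor in (0, 1)
        self.cells = {}             # cell index -> (density, last update time)

    def cell_of(self, point):
        # Map a point to the integer index of the grid cell that contains it.
        return tuple(math.floor(x / self.cell_size) for x in point)

    def insert(self, point, t):
        # Online step: decay the stored density up to time t, then add 1.
        key = self.cell_of(point)
        density, last = self.cells.get(key, (0.0, t))
        self.cells[key] = (density * self.decay ** (t - last) + 1.0, t)

    def density(self, key, t):
        # Density of a cell as seen at time t (decayed lazily on read).
        density, last = self.cells[key]
        return density * self.decay ** (t - last)


def offline_clusters(synopsis, t, min_density=2.0):
    """Offline step (assumed): group axis-adjacent dense cells into clusters."""
    dense = {k for k in synopsis.cells if synopsis.density(k, t) >= min_density}
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:  # breadth-first search over neighbouring dense cells
            cell = queue.popleft()
            component.append(cell)
            for dim in range(len(cell)):
                for step in (-1, 1):
                    nb = cell[:dim] + (cell[dim] + step,) + cell[dim + 1:]
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        clusters.append(sorted(component))
    return clusters
```

In this sketch the online layer touches only one dictionary entry per arriving point, which is what keeps it fast, while the offline layer works on the much smaller set of cells rather than on the raw stream. Because of the decay, cells that stop receiving points fall below min_density over time and drop out of the clustering.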
