Abstract
With recent advances in the Industrial Internet of Things (IIoT), the stream data generated by IIoT applications present new challenges: huge data volumes, ultrahigh computational complexity, large memory consumption, and concept drift, which makes it difficult to distinguish real drift from anomalies. Current mainstream methods struggle to cope with these problems effectively. In this article, we propose GKMC, an efficient Gaussian kernel microcluster real-time clustering method for IIoT data streams. The method clusters microcluster sketch structures directly instead of individual data points, which addresses the problem that unbounded data cannot be stored in limited memory. It uses a Gaussian kernel function to compute the local density of microclusters, which enhances anomaly detection. In addition, it recursively updates the microclusters online using a gravity energy function, and it applies a relearning strategy to better determine whether an outdated microcluster is an abnormal microcluster or reflects real concept drift, ensuring that the current microclusters are always the latest ones most closely related to their clusters. Theoretical analysis and extensive comparison experiments on three data sets show that the proposed algorithm achieves a better clustering effect than current mainstream stream clustering algorithms.
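To make the microcluster idea concrete, the following is a minimal illustrative sketch and not the GKMC algorithm itself. The abstract does not give the formulas, so several pieces are assumptions: the sketch stores only a center and a point count per microcluster, it uses a plain running-mean update as a stand-in for the gravity energy function, it uses the common Gaussian kernel density form rho_i = sum over j != i of exp(-(d_ij / sigma)^2), and the names `MicroCluster`, `insert_point`, `gaussian_local_density`, `radius`, and `sigma` are hypothetical. The relearning strategy is not modeled.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MicroCluster:
    """Sketch of a microcluster: a summary kept instead of the raw points."""
    center: List[float]
    count: int = 1

    def absorb(self, point: Sequence[float]) -> None:
        # Assumption: a recursive running-mean update stands in for the
        # gravity energy function named in the abstract.
        self.count += 1
        self.center = [c + (x - c) / self.count
                       for c, x in zip(self.center, point)]


def insert_point(clusters: List[MicroCluster],
                 point: Sequence[float],
                 radius: float) -> None:
    """Absorb the point into the nearest microcluster within `radius`,
    otherwise open a new microcluster (hypothetical assignment rule)."""
    nearest = min(clusters,
                  key=lambda mc: math.dist(mc.center, point),
                  default=None)
    if nearest is not None and math.dist(nearest.center, point) <= radius:
        nearest.absorb(point)
    else:
        clusters.append(MicroCluster(center=list(point)))


def gaussian_local_density(clusters: List[MicroCluster],
                           sigma: float) -> List[float]:
    """Local density of each microcluster under a Gaussian kernel
    (assumed form; the paper's exact definition may differ)."""
    densities = []
    for i, ci in enumerate(clusters):
        rho = 0.0
        for j, cj in enumerate(clusters):
            if i != j:
                d = math.dist(ci.center, cj.center)
                rho += math.exp(-(d / sigma) ** 2)
        densities.append(rho)
    return densities


# Usage: memory grows with the number of microclusters, not with the stream.
clusters: List[MicroCluster] = []
for p in [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (0.1, 0.0)]:
    insert_point(clusters, p, radius=0.5)
densities = gaussian_local_density(clusters, sigma=1.0)
```

In this toy stream, the isolated microcluster near (10, 10) receives a local density close to zero, which is the kind of signal a density-based method could use to flag candidate anomalies.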