Abstract

Combining artificial intelligence with data, computing power, and new algorithms can provide important tools for solving engineering problems such as machine-tool condition monitoring. However, many of these problems require algorithms that perform well in highly dynamic scenarios, where data streams from different types of variables arrive at extremely high sampling rates. Gaussian-based dynamic probabilistic clustering (GDPC), an unsupervised learning algorithm based on Gaussian mixture models, is one of these tools. However, GDPC becomes unstable when a large number of concept drifts associated with transients occur within the data stream, so we propose a new algorithm called GDPC+ to increase its robustness. GDPC+ introduces two improvements: (a) automatic selection of the number of mixture components based on the Bayesian information criterion (BIC), and (b) stabilization of concept drift transitions based on the Cauchy–Schwarz divergence integrated with the Dickey–Fuller test. As a result, GDPC+ can outperform GDPC in highly dynamic scenarios in terms of the number of false positives. The behavior of GDPC+ was investigated using random synthetic data streams and a real condition-monitoring data stream obtained from a machine tool that produces engine crankshafts at high speed. We found that the initial temporal window size can be used to adapt the algorithm to different analytical requirements. The clustering results were also analyzed using rules induced by the repeated incremental pruning to produce error reduction (RIPPER) algorithm, in order to provide insights into the underlying monitored process and its associated concept drifts.
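
To make the Cauchy–Schwarz divergence mentioned above concrete, the following minimal sketch computes it in closed form between two one-dimensional Gaussian mixtures. It is an illustration under stated assumptions, not the GDPC+ implementation: the representation of a mixture as a list of (weight, mean, variance) triples and the function names `gauss_overlap`, `cross_term`, and `cs_divergence` are hypothetical, and the abstract does not specify how the divergence is combined with the Dickey–Fuller test, so that step is not shown.

```python
import math

def gauss_overlap(m1, v1, m2, v2):
    """Integral over x of N(x; m1, v1) * N(x; m2, v2) for 1-D Gaussians.

    The closed form is a Gaussian density in (m1 - m2) with variance v1 + v2.
    """
    v = v1 + v2
    return math.exp(-0.5 * (m1 - m2) ** 2 / v) / math.sqrt(2.0 * math.pi * v)

def cross_term(p, q):
    """Integral of p(x) * q(x) for two mixtures given as (weight, mean, variance) triples."""
    return sum(
        wp * wq * gauss_overlap(mp, vp, mq, vq)
        for wp, mp, vp in p
        for wq, mq, vq in q
    )

def cs_divergence(p, q):
    """Cauchy-Schwarz divergence: -log( int(pq) / sqrt(int(p^2) * int(q^2)) ).

    It is zero when the two densities coincide, and it is symmetric in p and q.
    Very distant mixtures can underflow the cross term to 0.0; this sketch
    does not guard against that case.
    """
    return -math.log(cross_term(p, q) / math.sqrt(cross_term(p, p) * cross_term(q, q)))

# Hypothetical mixtures standing in for the model before and after a drift.
before = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
after = [(0.5, 0.5, 1.0), (0.5, 6.0, 1.5)]

print(cs_divergence(before, before))
print(cs_divergence(before, after))
```

A closed form exists here because the product of two Gaussian densities integrates to another Gaussian density, so no sampling or numerical integration is needed to compare two fitted mixtures.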
