Abstract

Data streams are collected from the real-time operation of complex machines, plants and other technological systems. The collected information can be used for purposes such as performance evaluation, anomaly detection, change detection and fault diagnosis of the operating systems. The data streams are analyzed with a computational tool that estimates the similarity level between pairs of data chunks taken from the stream, which are treated as data clouds. One of these data clouds represents a prerecorded (previously known) abnormal operation of the system, while the other represents the current (still unknown) behavior of the system. The similarity analysis then shows how close the two data clouds are. In this paper we propose a novel method for similarity analysis that uses two types of models, called the Data Cloud Model (DCM) and the Window Cloud Model (WCM). The DCM is obtained from the data cloud that represents a previously known operation of the system, while the WCM is obtained from the data cloud newly collected from the data stream, called the window data cloud. Both data clouds have equal length (the same number of data points). The algorithms for creating the DCM and the WCM are explained in the paper. The DCM consists of active grid cells that approximately represent the data density within the known data cloud. The WCM estimates the data density at the same active grid cells, but based on the data from the new window data cloud. The two densities are represented as two histograms, which are compared with each other to calculate the similarity level as a value between 0.0 and 1.0. Another problem discussed in the paper is finding a plausible method for detecting significant changes in the data streams. Here the moving window technique is used to collect a series of subsequent data clouds. Two moving window procedures with different window lengths are run in parallel, each calculating the center of gravity of its data in real time. The difference between the results of the two moving windows is used as an estimate of the change in the process operation. Both techniques are explained in detail in the paper and illustrated on a real data stream from a petrochemical plant.
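
The two ideas summarized above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the grid construction or the histogram comparison formula, so the sketch assumes a uniform grid with a fixed `cell_size` and `origin`, uses histogram intersection as the comparison, and measures the change as the Euclidean distance between the two centers of gravity. The function names (`cell_keys`, `similarity`, `change_estimate`) are hypothetical.

```python
from collections import Counter

import numpy as np


def cell_keys(points, origin, cell_size):
    """Assign each point to a cell of a uniform grid (assumed layout).

    Each cell is identified by a tuple of integer indices, one per dimension.
    """
    idx = np.floor((np.asarray(points, dtype=float) - origin) / cell_size).astype(int)
    return [tuple(int(i) for i in row) for row in idx]


def similarity(known_cloud, window_cloud, origin=0.0, cell_size=1.0):
    """Compare two data clouds on the active cells of the known cloud.

    Active cells are taken to be the cells that hold at least one point of
    the known cloud. Histogram intersection is an assumed comparison rule;
    it yields a value between 0.0 and 1.0.
    """
    known = Counter(cell_keys(known_cloud, origin, cell_size))
    window = Counter(cell_keys(window_cloud, origin, cell_size))
    n_known, n_window = len(known_cloud), len(window_cloud)
    # Window points that fall outside the active cells contribute nothing.
    return sum(min(known[c] / n_known, window[c] / n_window) for c in known)


def change_estimate(stream, short_len, long_len):
    """Distance between the centers of gravity of two moving windows.

    Both windows end at the same sample; the first estimate is produced
    once the longer window is full. Euclidean distance is an assumption.
    """
    stream = np.asarray(stream, dtype=float)
    estimates = []
    for t in range(long_len - 1, len(stream)):
        cog_short = stream[t - short_len + 1 : t + 1].mean(axis=0)
        cog_long = stream[t - long_len + 1 : t + 1].mean(axis=0)
        estimates.append(np.linalg.norm(cog_short - cog_long))
    return np.array(estimates)


if __name__ == "__main__":
    known = [[0.1, 0.1], [0.2, 0.2], [1.5, 1.5], [1.6, 1.4]]
    window = [[0.1, 0.1], [0.3, 0.3], [0.5, 0.5], [1.5, 1.5]]
    print("similarity:", similarity(known, window))

    # A one-dimensional stream with a step change halfway through.
    step = np.concatenate([np.zeros(10), np.ones(10)]).reshape(-1, 1)
    print("change estimates:", change_estimate(step, short_len=2, long_len=5))
```

In this sketch the change estimate stays at zero while the process is steady, rises when the short window reacts to the step before the long window does, and returns to zero once both windows contain only post-change data.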

