Abstract

Rapid industrial growth means that enormous volumes of data are now generated continuously. Applications such as traffic surveillance, energy management, health care, target tracking, industrial machinery, medical equipment, mobile devices and web applications rely on cameras and sensors that produce massive data streams. Mining these streams can reveal patterns and information of considerable value to the industry concerned. A Phasor Measurement Unit (PMU) is a sensor installed in a smart electrical grid that produces a large volume of streaming data, and remote centers depend on this data for decision making. Compared with traditional data mining, however, data stream classification poses additional challenges, including concept drift, high data volume and high data velocity. This paper focuses on classifying anomalies in streaming PMU data, considering all features, using supervised and clustering ensemble techniques. For the supervised approach we use random forest, an ensemble of individual decision trees, each of which produces a single output; the final output for a data point is the average of the tree outputs (for regression) or their mode (for classification). A novel framework is proposed for detecting and classifying anomalies in streaming PMU data using a clustering ensemble algorithm. The benefit of the proposed work is that it labels the streaming PMU data in addition to detecting anomalies. The ensemble members are created with three clustering methods: K-means, hierarchical clustering and the Gaussian Mixture Model (GMM). The consensus functions used to integrate the results of the individual base clusterings are the Cluster-based Similarity Partitioning Algorithm, the HyperGraph Partitioning Algorithm and the Meta-Clustering Algorithm. The clustering model is evaluated using the adjusted Rand index. Experiments carried out in Python give promising results, which demonstrate the effectiveness of the proposed method for classifying anomalies in real-time PMU data.
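
To make the clustering ensemble idea concrete, the following is a minimal sketch of a consensus step followed by evaluation with the adjusted Rand index. It is not the paper's implementation. The three label vectors are hypothetical stand-ins for the outputs of K-means, hierarchical clustering and GMM on eight samples, and the reference labels are likewise invented for illustration. The consensus function here is a simple majority-vote co-association scheme, chosen as a lightweight substitute for the graph-partitioning consensus functions named above. All function and variable names are assumptions.

```python
from collections import Counter
from itertools import combinations
from math import comb


def majority_consensus(labelings):
    """Link two points if a strict majority of base clusterings co-cluster them;
    consensus clusters are the connected components of the resulting graph.
    (Simplified co-association consensus, not CSPA/HGPA/MCA.)"""
    n, m = len(labelings[0]), len(labelings)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        votes = sum(labels[i] == labels[j] for labels in labelings)
        if 2 * votes > m:
            parent[find(i)] = find(j)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same points."""
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # both partitions trivial
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


# Hypothetical base clusterings of 8 samples (cluster ids are arbitrary per method).
kmeans_labels = [0, 0, 0, 0, 0, 1, 1, 1]
hier_labels = [1, 1, 1, 1, 1, 1, 0, 0]
gmm_labels = [0, 0, 0, 0, 1, 0, 1, 1]

# Hypothetical reference labels: samples 6 and 7 are the anomalies.
reference = [0, 0, 0, 0, 0, 0, 1, 1]

consensus = majority_consensus([kmeans_labels, hier_labels, gmm_labels])
print("consensus labels:", consensus)
print("ARI (K-means alone):", adjusted_rand_index(kmeans_labels, reference))
print("ARI (consensus):", adjusted_rand_index(consensus, reference))
```

In this toy case each base clustering disagrees with the others on a different sample, and the majority vote recovers a grouping that none of the disagreeing members produced alone. Because the adjusted Rand index is invariant to how cluster ids are numbered, it can compare labelings from different methods directly.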
