Abstract Data analysis has become one of the most widespread fields of research and now extends into almost every field of study. Given recent developments in communication and information technology, there is scope for combining the monitoring of substation equipment with big data analysis technology, which would improve data analysis capability, information sharing, and the utilization rate of monitoring data. This work introduces big data analysis and its application to substation monitoring. The basic concepts and procedures of typical data analysis for general problems are also discussed. As the main part of the paper, several distributed data analysis techniques are proposed, the most important of which are two relational online analysis techniques, Hive and Impala, and one HBase-based multidimensional online analysis technique. These techniques are proposed with regard to analysis efficiency and storage performance, from the point of view of the substation's business development requirements. Experiments are carried out to verify the validity of the model. The results show that the proposed model has an advantage in storage overhead and roll-up performance compared with the traditional method, although its data loading speed is approximately 1.7–1.9 times that of the traditional model.