Abstract

Managing massive electric power data is a typical big data application because electric power systems generate millions or billions of status, debugging, and error records every day. To guarantee the safety and sustainability of electric power systems, these data need to be processed and analyzed quickly enough to support real-time decisions. Traditional solutions typically use relational databases to manage electric power data. However, relational databases cannot process and analyze electric power data efficiently once the data size grows significantly. In this paper, we show how electric power data can be managed using HBase, a distributed database maintained by Apache. Our system consists of clients, an HBase database, status monitors, data migration modules, and data fragmentation modules. We evaluate the performance of our system through a series of experiments. We also show how HBase's parameters can be tuned to improve the efficiency of our system.

Highlights

  • Electric power systems are essential to modern society

  • The electric power data processed by the Power Dispatching Automation System (PDAS) are typical 4Vs data

  • We propose a big data platform that uses the Apache HBase distributed database to store and query electric power data

Summary

Introduction

Electric power systems are essential to modern society. As a core subsystem, the Power Dispatching Automation System (PDAS) processes runtime information and makes real-time control decisions, which guarantees the safety and sustainability of electric power systems[1]. System[4], Hadoop Distributed File System (HDFS)[5], BigTable[6], and Apache HBase[7] can be used to store large amounts of electric power data. Parallel processing technologies, such as MapReduce[8] and Spark[9, 10], make real-time processing of electric power data possible. This big data platform is based on a Hadoop cluster and is deployed in a cloud computing environment that utilizes server virtualization technology. As a batch processing system, Hadoop would not work for real-time queries. To address this problem, we propose a big data platform that uses the Apache HBase distributed database to store and query electric power data.
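To illustrate why a sorted key-value store such as HBase suits real-time lookups of power records, the sketch below shows one common pattern: a row key built from a device identifier followed by a big-endian timestamp, so that all records for one device in a time window form a contiguous key range. This is only an assumption-laden illustration. The schema the paper actually uses is not shown in this summary, the device names and the `status` field are hypothetical, and `SortedTable` is a toy in-memory stand-in rather than an HBase client.

```python
import bisect
import struct


def make_row_key(device_id: str, timestamp_ms: int) -> bytes:
    """Build a sortable row key: device id, a separator byte, then the timestamp.

    Hypothetical layout for illustration; not taken from the paper.
    """
    # Big-endian encoding makes byte order match numeric order
    # for non-negative timestamps.
    return device_id.encode("utf-8") + b"\x00" + struct.pack(">Q", timestamp_ms)


class SortedTable:
    """Toy in-memory table that keeps rows ordered by key, as HBase does."""

    def __init__(self):
        self._keys = []   # row keys kept in sorted order
        self._rows = {}   # row key -> record

    def put(self, key: bytes, value: dict) -> None:
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan(self, start: bytes, stop: bytes):
        """Return (key, value) pairs with start <= key < stop."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


table = SortedTable()
table.put(make_row_key("breaker-7", 1000), {"status": "ok"})
table.put(make_row_key("breaker-7", 2000), {"status": "error"})
table.put(make_row_key("breaker-7", 3000), {"status": "ok"})
table.put(make_row_key("line-12", 1500), {"status": "ok"})

# Range scan: records for breaker-7 with 1500 <= timestamp < 3001.
recent = table.scan(make_row_key("breaker-7", 1500),
                    make_row_key("breaker-7", 3001))
statuses = [row["status"] for _, row in recent]
```

Because the keys are sorted, the scan touches only the requested device and time window instead of the whole table, which is the property that makes this style of key design attractive for time-ordered monitoring data.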

Apache HBase
Our System Architecture
Return Query Results
Managing Electric Power Data on HBase
System settings
Comparison with existing system
Performance of data storage
Effects of data fragmentation
Effects of data migration
Scalability
Effects of HBase parameters
Related Work
Conclusion
