Abstract
Managing massive electric power data is a typical big data application because electric power systems generate millions or billions of status, debugging, and error records every day. To guarantee the safety and sustainability of electric power systems, these data need to be processed and analyzed quickly enough to support real-time decisions. Traditional solutions typically use relational databases to manage electric power data. However, relational databases cannot process and analyze electric power data efficiently once the data size grows significantly. In this paper, we show how electric power data can be managed by using HBase, a distributed database maintained by Apache. Our system consists of clients, an HBase database, status monitors, data migration modules, and data fragmentation modules. We evaluate the performance of our system through a series of experiments. We also show how HBase's parameters can be tuned to improve the efficiency of our system.
Highlights
Electric power systems are essential to modern society
The electric power data processed by the Power Dispatching Automation System (PDAS) are typical 4Vs data
We propose a big data platform that uses the Apache HBase distributed database to store and query electric power data
Summary
Electric power systems are essential to modern society. As a core subsystem, the Power Dispatching Automation System (PDAS) processes runtime information and makes real-time control decisions, which guarantees the safety and sustainability of electric power systems[1]. System[4], Hadoop Distributed File System (HDFS)[5], BigTable[6], and Apache HBase[7] can be used to store large amounts of electric power data. Parallel processing technologies, such as MapReduce[8] and Spark[9, 10], make real-time processing of electric power data possible. This big data platform is based on a Hadoop cluster and is deployed in a cloud computing environment that utilizes server virtualization technology. As a batch processing system, Hadoop is not suited to real-time queries. To address this problem, we propose a big data platform that uses the Apache HBase distributed database to store and query electric power data.
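To make the idea of storing and querying time-ordered power records in HBase more concrete, the following is a minimal sketch of one common row key layout for such data. The abstract does not describe the schema actually used, so everything here is an assumption for illustration: the function names, the device identifier `bus-1`, and the layout itself (device id, a NUL separator, then a reversed millisecond timestamp so that newer records sort first). The sketch uses only the Python standard library and makes no HBase client calls; it relies only on the fact that HBase orders rows by the byte value of their keys.

```python
import struct

# Largest signed 64-bit value; subtracting a timestamp from it reverses the order.
MAX_TS_MS = 2**63 - 1


def make_row_key(device_id: str, timestamp_ms: int) -> bytes:
    """Build a hypothetical row key: device id + NUL + reversed timestamp.

    This layout is an illustrative assumption, not the schema from the paper.
    """
    if "\x00" in device_id:
        raise ValueError("device_id must not contain a NUL character")
    if not 0 <= timestamp_ms <= MAX_TS_MS:
        raise ValueError("timestamp_ms out of range")
    reversed_ts = MAX_TS_MS - timestamp_ms
    # Big-endian packing keeps numeric order equal to byte order.
    return device_id.encode("utf-8") + b"\x00" + struct.pack(">Q", reversed_ts)


def scan_bounds(device_id: str, start_ms: int, end_ms: int) -> tuple[bytes, bytes]:
    """Return (start_row, stop_row) covering start_ms <= t <= end_ms.

    Assumes a scan that includes start_row and excludes stop_row.
    Because timestamps are reversed, the newest record comes first.
    """
    if start_ms > end_ms:
        raise ValueError("start_ms must not exceed end_ms")
    start_row = make_row_key(device_id, end_ms)
    # Appending one byte gives the smallest key strictly after the oldest record.
    stop_row = make_row_key(device_id, start_ms) + b"\x00"
    return start_row, stop_row


if __name__ == "__main__":
    keys = sorted(make_row_key("bus-1", t) for t in (1000, 3000, 2000))
    start, stop = scan_bounds("bus-1", 1000, 2000)
    selected = [k for k in keys if start <= k < stop]
    print(len(selected), "of", len(keys), "records fall in the requested window")
```

Placing the device id first keeps all records of one device contiguous, so a query for a recent time window on a single device becomes one short range scan rather than a full table scan. Other layouts, such as salted or hashed prefixes to spread write load, trade this locality for better load balance.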