In a smart grid distribution management system, operation, planning, forecasting, and decision making rely on demand-side management functions, which require real-time smart grid data. This data has significant monetary value because it is extremely useful for efficient control and intelligent prediction of energy consumption and for expert management of residential and commercial load. However, the huge volume of smart grid data, generated at very high velocity, poses a number of challenges. Utility companies have a strong demand for efficient summarization techniques to mine interesting patterns and extract useful, actionable intelligence. Research from various domains has shown that data summarization can significantly improve the scalability and efficiency of data analytic tasks (e.g., transactional database mining, data stream mining, network monitoring). This paper proposes a summarization approach (i.e., a set of algorithms, data structures, and query mechanisms) that enables the utility company to accurately infer various energy consumption patterns in real time by automatically monitoring smart grid data using significantly fewer computational resources. The proposed summarization approach is suitable for processing spatiotemporal streams, and it can also provide answers in real time to various smart grid applications (e.g., demand-side management, direct load control, smart pricing, and Volt-VAr control). Both a theoretical bound and an experimental evaluation are presented, which show that the memory required for the proposed data structure grows linearly for the first 52 weeks; interestingly, after the first year, the memory growth is negligible. The experimental results show that the proposed approach can process around 4 million smart meter readings every second or 120 million readings every minute.
The proposed approach outperforms widely used commercial Database Management Systems (DBMSs) in terms of update and query costs: it is about 200 times faster in update time and about 340 times faster in query time.