Short-term load forecasting with clustering–regression model in distributed cluster

Jingsheng Lei,Jiawei Hao,Fengyong Li,Ting Jin

doi:10.1007/s10586-017-1198-4

Abstract

This paper tackles a new challenge in power big data: how to improve the precision of short-term load forecasting with large-scale data set. The proposed load forecasting method is based on Spark platform and “clustering–regression” model, which is implemented by Apache Spark machine learning library (MLlib). Proposed scheme firstly clustering the users with different electrical attributes and then obtains the “load characteristic curve of each cluster”, which represents the features of various types of users and is considered as the properties of a regional total load. Furthermore, the “clustering–regression” model is used to forecast the power load of the certain region. Extensive experiments show that the proposed scheme can predict reasonably the short-term power load and has excellent robustness. Comparing with the single-alone model, the proposed method has a higher efficiency in dealing with large-scale data set and can be effectively applied to the power load forecasting.

Full Text