Abstract

Smart meters record consumption at different time resolutions and can generate large volumes of time series that require specialized tools for consumption monitoring and fraud detection. The readings often contain numerous null values and outliers that affect fraud detection results. Extracting features from time series is challenging, especially when patterns and irregularities have to be identified. We therefore propose combining ten supervised Machine Learning (ML) algorithms with the recent Python library TSFEL (Time Series Feature Extraction Library), which automatically extracts over 60 features from time series across three domains: statistical (such as mean absolute deviation, variance, and interquartile range), temporal (such as autocorrelation, mean absolute differences, entropy, and peak-to-peak distance), and spectral (such as FFT mean coefficient, wavelet absolute mean, standard deviation, spectral distance, and fundamental frequency). Two algorithms, Multi-Layer Perceptron and Light Gradient Boost, provide very good results in identifying suspicious consumers on a real consumption dataset recorded in China by the utility company State Grid Corporation of China. The performance of the ML algorithms with TSFEL is compared with the case without feature engineering. We also propose a data processing methodology comprising several significant stages that precede model training.
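To make the three feature domains concrete, the following is a minimal sketch that computes a few of the named features by hand using only the Python standard library. It is not TSFEL's implementation and does not call TSFEL. The function names, the synthetic `readings` series, and the sampling rate `fs` are assumptions made for illustration, and the spectral part uses the dominant non-zero frequency of a naive DFT as a simple stand-in for a fundamental-frequency estimate.

```python
import cmath
import math
import statistics


def statistical_features(x):
    """Statistical-domain features of a list of readings."""
    mean = statistics.fmean(x)
    q1, _, q3 = statistics.quantiles(x, n=4)
    return {
        "mean_abs_deviation": statistics.fmean(abs(v - mean) for v in x),
        "variance": statistics.pvariance(x),
        "interquartile_range": q3 - q1,
    }


def temporal_features(x):
    """Temporal-domain features based on the ordering of the readings."""
    mean = statistics.fmean(x)
    diffs = [b - a for a, b in zip(x, x[1:])]
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)
    # Lag-1 autocorrelation of the mean-removed series (0 for a constant series).
    lag1 = sum(a * b for a, b in zip(centered, centered[1:])) / denom if denom else 0.0
    return {
        "mean_abs_diff": statistics.fmean(abs(d) for d in diffs),
        "peak_to_peak": max(x) - min(x),
        "lag1_autocorrelation": lag1,
    }


def spectral_features(x, fs):
    """Spectral-domain features from a naive O(n^2) DFT; fs is samples per time unit."""
    n = len(x)
    mean = statistics.fmean(x)
    magnitudes = []
    for k in range(1, n // 2 + 1):  # skip k = 0 (the mean component)
        coeff = sum(
            (v - mean) * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(x)
        )
        magnitudes.append(abs(coeff))
    k_peak = 1 + max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return {
        "mean_spectral_magnitude": statistics.fmean(magnitudes),
        "dominant_frequency": k_peak * fs / n,
    }


# Hypothetical data: 28 daily readings with a weekly cycle (not from the paper's dataset).
readings = [10 + 3 * math.sin(2 * math.pi * day / 7) for day in range(28)]

features = {}
features.update(statistical_features(readings))
features.update(temporal_features(readings))
features.update(spectral_features(readings, fs=1.0))  # one sample per day
```

Each consumer's series is reduced to one fixed-length feature vector of this kind, which is the form a supervised classifier expects. A library such as TSFEL automates the same idea over a much larger feature set.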
