Abstract
Predicting the operation of wind farms requires processing massive amounts of data, in particular preprocessing wind farm meteorological data, or numerical weather prediction (NWP) data. Because NWP data are strongly correlated with wind farm operation, proper processing of NWP data can both reduce data volume and improve wind farm operation predictions. For this purpose, this paper proposes a data preprocessing algorithm based on t-distributed stochastic neighbor embedding (t-SNE). First, the collected data are normalized to eliminate the influence of differing units and scales. The t-SNE algorithm is then used to reduce the dimensionality of the NWP data related to wind farm operation. Finally, a wind farm data visualization platform is established. In this paper, 22 index variables in the NWP data were taken as objects. The t-SNE method was used to preprocess the historical NWP data of a wind farm, and the results were compared with those of the principal component analysis (PCA) algorithm. t-SNE outperformed PCA in error precision; in addition, t-SNE dimension reduction also produced a visual representation of the data, which can be applied to big data visualization platforms. A long short-term memory (LSTM) network was used to predict the operation of the wind farm by combining the preprocessed NWP data with the operation data. The simulation results showed that NWP data preprocessed with t-SNE significantly improved wind power prediction.
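The normalization step can be illustrated with a minimal sketch. The abstract does not say which normalization was used, so min-max scaling to [0, 1] is an assumption here, and the function name `min_max_normalize` and the three sample variables are hypothetical.

```python
def min_max_normalize(rows):
    """Scale each column of a list of equal-length rows to the range [0, 1].

    Assumption: min-max scaling; the paper may use a different scheme.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column carries no information; map it to 0.0.
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical NWP records: wind speed (m/s), pressure (hPa), temperature (deg C).
samples = [
    [3.0, 1010.0, 12.0],
    [9.0, 1000.0, 18.0],
    [6.0, 1005.0, 15.0],
]
normalized = min_max_normalize(samples)
```

After scaling, variables measured in different units contribute on a comparable scale to the pairwise distances that a method such as t-SNE or PCA relies on.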
Highlights
Wind power is becoming one of the most important power sources in the power grid
Data acquisition and storage are the basis for an in-depth understanding of the operational status of wind turbines as captured in wind power big data; data preprocessing is a prerequisite for analyzing massive data
To remove noise from the numerical weather prediction (NWP) samples and visualize the characteristics of wind farm meteorological data in a low-dimensional space, the sample set was reduced from 22 dimensions to 2 using the t-distributed stochastic neighbor embedding (t-SNE) algorithm, with the perplexity set to 20 and the number of iterations set to 5000
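The reduction from 22 dimensions to 2 could be sketched as follows. This is not the authors' implementation: it assumes scikit-learn's `TSNE` as the t-SNE implementation, uses random placeholder data in place of real NWP records, and the helper name `embed_nwp` is hypothetical. The demo call uses fewer iterations than the 5000 reported above so that it finishes quickly.

```python
import inspect

import numpy as np
from sklearn.manifold import TSNE


def embed_nwp(features, perplexity=20.0, iterations=5000, seed=0):
    """Project an (n_samples, n_features) array to 2-D with t-SNE.

    Defaults follow the settings reported in the highlights
    (perplexity 20, 5000 iterations).
    """
    # scikit-learn renamed the iteration argument from n_iter to max_iter;
    # pick whichever one the installed version accepts.
    params = inspect.signature(TSNE.__init__).parameters
    iter_key = "max_iter" if "max_iter" in params else "n_iter"
    model = TSNE(
        n_components=2,
        perplexity=perplexity,
        init="pca",
        random_state=seed,
        **{iter_key: iterations},
    )
    return model.fit_transform(features)


# Placeholder data: 100 already-normalized samples of 22 NWP index variables.
rng = np.random.default_rng(0)
nwp = rng.random((100, 22))

# Shortened run for illustration only; the reported setting is 5000 iterations.
embedding = embed_nwp(nwp, iterations=300)
```

Each row of `embedding` is a 2-D point that can be drawn on a scatter plot. Note that perplexity must be smaller than the number of samples, so a setting of 20 requires more than 20 records.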
Summary
Wind power is becoming one of the most important power sources in the power grid. China’s accumulated wind power capacity is 188 GW, and its total installed capacity now ranks first in the world [1]. As the penetration rate of wind power increases, wind farms generate a huge amount of data recording the operational status of wind turbines, and these data need to be studied using big data technology [2,3]. The key technologies of power big data comprise five parts: data acquisition, data storage, data preprocessing, data analysis, and data visualization.