Abstract

Quality control can effectively improve the quality of surface meteorological observations. To ensure that a quality control model remains stable and effective under different terrain and climate conditions, the model must have strong generalization ability. Algorithms such as Random Forest provide such generalization ability. However, machine learning algorithms are slower than traditional mathematical models. Therefore, a Random Forest quality control algorithm based on Principal Component Analysis (PCA-RF) is proposed in this paper. Fifteen target stations under different climatic and geomorphological conditions were selected and tested using observations collected four times daily at neighboring stations from 2005 to 2014. The results show that using PCA to analyze the elemental composition and select elements with high correlation factors, and applying the Random Forest algorithm, can effectively reduce the run time while maintaining the accuracy of the model. The training sample dependence, model prediction accuracy and error detection rate of the PCA-RF model are superior to those of the Spatial Regression method. Therefore, the PCA-RF method is a better quality control model for the spatial quality control of multiple elements of surface air temperature observations.

Highlights

  • The Principal Component Analysis (PCA)-Random Forest (RF) estimates were evaluated against the measurements at 15 target stations; selected observations are shown in Figure 1 and Table 1 (You et al, 2017)

  • To analyze the detection rates of the PCA-RF and spatial regression test (SRT) methods, two evaluation indexes, the mean absolute error (MAE) and root mean square error (RMSE), and two types of error were used to compare their performance; observation selection was the same as that used in the correlation factor analysis

  • This paper studies a spatial quality control (QC) model of six meteorological elements and temperature observations using the PCA-RF model, and the results are compared to those of the SRT method
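
The two evaluation indexes named in the highlights, MAE and RMSE, can be illustrated with a minimal sketch. The function names and the sample temperature values below are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own evaluation procedure may differ in detail.

```python
import math


def mae(observed, estimated):
    """Mean absolute error between paired observations and model estimates."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / len(observed)


def rmse(observed, estimated):
    """Root mean square error between paired observations and model estimates."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    return math.sqrt(squared / len(observed))


# Hypothetical air temperatures in degrees Celsius (not data from the paper).
observed = [10.0, 12.0, 14.0, 16.0]
estimated = [11.0, 12.0, 13.0, 18.0]

print("MAE :", mae(observed, estimated))
print("RMSE:", rmse(observed, estimated))
```

Because RMSE squares each residual before averaging, it penalizes a few large estimation errors more heavily than MAE does, which is one reason the two indexes are often reported together.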


Summary

Introduction

Meteorological observations are important for identifying and understanding variations and changes in regional and global climate (Feng et al, 2004). These observations are essential to a wide range of meteorological applications, such as climate monitoring, weather forecasting and the evaluation of numerical weather prediction (NWP) models (He et al, 2016; Ingleby and Lorenc, 1993). Observations from surface stations are affected by gross errors and random errors during the data acquisition process. The primary purpose of quality control (QC) is to identify the gross errors and large random errors present in large numbers of observations (Shi-wei et al, 2009).

