Twenty-four-hour cloud cover calculation using a ground-based imager with machine learning

Bu-Yo Kim, Ki-Ho Chang, Joo Wan Cha

doi:10.5194/amt-14-6695-2021

Open Access

Abstract

In this study, image data features and machine learning methods were used to calculate 24 h continuous cloud cover from image data obtained by a ground-based camera imager. The image data features were the time (Julian day and hour), the solar zenith angle, and statistical characteristics of the red–blue ratio, blue–red difference, and luminance. These features were determined from the red, green, and blue brightness of images that had been pre-processed by masking removal and distortion correction. The collected image data were divided into training, validation, and test sets and were used to optimize each machine learning method and evaluate its accuracy. The cloud cover calculated by each machine learning method was verified against human-eye observation data from a manned observatory. Supervised machine learning models suitable for nowcasting, namely support vector regression, random forest, gradient boosting machine, k-nearest neighbor, artificial neural network, and multiple linear regression, were employed and their results were compared. The best learning results were obtained by the support vector regression model, which had an accuracy, recall, and precision of 0.94, 0.70, and 0.76, respectively. Further, bias, root mean square error, and correlation coefficient values of 0.04 tenths, 1.45 tenths, and 0.93, respectively, were obtained for the cloud cover calculated using the test set. When differences of up to 0, 1, and 2 tenths between the calculated and observed cloud cover were allowed, agreements of approximately 42 %, 79 %, and 91 %, respectively, were obtained. The proposed system, which combines a ground-based imager with machine learning methods, is expected to be suitable as an automated replacement for human-eye observations.
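The verification statistics quoted in the abstract (bias, root mean square error, correlation coefficient, and agreement within 0, 1, and 2 tenths) can be illustrated with a short sketch. This is not the authors' code: the function name `verification_stats`, the sign convention for bias (calculated minus observed), and the use of the Pearson coefficient are assumptions made here for illustration.

```python
import math


def verification_stats(calculated, observed, tolerances=(0, 1, 2)):
    """Compare calculated and observed cloud cover, both given in tenths (0-10).

    Assumptions (not stated in the source): bias is mean(calculated - observed),
    and agreement at tolerance t is the share of cases with |difference| <= t.
    """
    if len(calculated) != len(observed) or not calculated:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(calculated)
    diffs = [c - o for c, o in zip(calculated, observed)]

    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Pearson correlation coefficient between the two series.
    mean_c = sum(calculated) / n
    mean_o = sum(observed) / n
    cov = sum((c - mean_c) * (o - mean_o) for c, o in zip(calculated, observed))
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_c > 0 and var_o > 0:
        r = cov / math.sqrt(var_c * var_o)
    else:
        r = float("nan")  # undefined when either series is constant

    # Percentage of cases whose absolute difference is within each tolerance.
    agreement = {
        t: 100.0 * sum(abs(d) <= t for d in diffs) / n for t in tolerances
    }
    return {"bias": bias, "rmse": rmse, "r": r, "agreement": agreement}


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    stats = verification_stats([0, 3, 5, 10, 8], [0, 2, 7, 10, 5])
    print(stats)
```

With this definition the agreement values are cumulative: every case counted at a tolerance of 0 tenths is also counted at 1 and 2 tenths, which matches the increasing percentages reported in the abstract.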

Highlights

In countries, including South Korea, that have not introduced automated systems, ground-based cloud cover observation has been performed using the human eye, in accordance with the normalized synoptic observation rule of the World Meteorological Organization (WMO), and recorded in tenths or oktas (Kim et al., 2016; Yun and Whang, 2018).
The accuracy was in the range of 0.91–0.98 for each cloud cover class, whereas recall and precision were in the ranges of 0.42–0.92 and 0.24–0.99, respectively, indicating low predictive power in the partly cloudy case.
Apart from the support vector regression (SVR) and random forest (RF) methods, the machine learning methods exhibited similar frequency distributions; the accuracy, recall, precision, and correlation coefficient (R) decreased and the root mean square error (RMSE) increased in the order of gradient boosting machines (GBMs), k-nearest neighbor (kNN), artificial neural networks (ANNs), and multiple linear regression (MLR).
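The per-class accuracy, recall, and precision mentioned in the highlights can be sketched as a one-versus-rest evaluation over the eleven cloud cover classes (0 to 10 tenths). The source does not spell out the exact procedure, so the one-versus-rest definition, the rounding of regression output to whole tenths, and the names `to_tenths` and `per_class_scores` are assumptions for illustration only.

```python
import math


def to_tenths(value):
    """Round a continuous model output to a whole cloud cover class in 0..10.

    Half-up rounding is an assumption; Python's built-in round() would round
    halves to the nearest even integer instead.
    """
    return min(10, max(0, int(math.floor(value + 0.5))))


def per_class_scores(calculated, observed, classes=range(11)):
    """One-versus-rest accuracy, recall, and precision for each class.

    Recall or precision is None when its denominator is zero, i.e. the class
    was never observed or never predicted.
    """
    if len(calculated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    pairs = list(zip(calculated, observed))
    scores = {}
    for k in classes:
        tp = sum(c == k and o == k for c, o in pairs)  # predicted k, observed k
        fp = sum(c == k and o != k for c, o in pairs)  # predicted k, observed other
        fn = sum(c != k and o == k for c, o in pairs)  # missed an observed k
        tn = n - tp - fp - fn
        scores[k] = {
            "accuracy": (tp + tn) / n,
            "recall": tp / (tp + fn) if (tp + fn) else None,
            "precision": tp / (tp + fp) if (tp + fp) else None,
        }
    return scores


if __name__ == "__main__":
    # Made-up model outputs and observations, not data from the study.
    raw_outputs = [0.2, 0.4, 5.1, 10.6, 9.8, 3.0]
    observed = [0, 1, 5, 10, 9, 3]
    calculated = [to_tenths(v) for v in raw_outputs]
    for k, s in per_class_scores(calculated, observed).items():
        print(k, s)
```

Under this definition a class can score a high accuracy while its recall or precision is low, because the true negatives from the other ten classes dominate the accuracy. That is one plausible reading of why the reported accuracy range is narrow while the recall and precision ranges are wide.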

Summary

Introduction

In countries, including South Korea, that have not introduced automated systems, ground-based cloud cover observation has been performed using the human eye, in accordance with the normalized synoptic observation rule of the World Meteorological Organization (WMO), and recorded in tenths or oktas (Kim et al., 2016; Yun and Whang, 2018). Deep learning methods such as convolutional neural networks, which repeatedly learn data features by sub-sampling image data at each convolution step for gradient descent, are also available (Dev et al., 2019; Shi et al., 2019; Xie et al., 2020). However, this approach is difficult to use for nowcasting because the learning and prediction processes consume considerable physical resources and time (Al Banna et al., 2020; Kim et al., 2021).
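By contrast, the color inputs named in the abstract (statistics of the red–blue ratio, blue–red difference, and luminance) are simple per-image summaries. The sketch below shows one way such features could be computed from the unmasked sky pixels of a single image. It rests on several assumptions that the source does not confirm: the luminance weights are the Rec. 709 values, the statistics are limited to mean and standard deviation, pixels with zero blue brightness are skipped in the ratio, and the name `color_features` is hypothetical.

```python
from statistics import fmean, pstdev

# Assumed Rec. 709 luma weights; the study's exact luminance formula
# is not given in this text.
LUMA_WEIGHTS = (0.2126, 0.7152, 0.0722)


def color_features(pixels):
    """Summarize red-blue ratio, blue-red difference, and luminance.

    pixels: iterable of (R, G, B) brightness values for unmasked sky pixels.
    Returns the mean and population standard deviation of each quantity.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("at least one pixel is required")

    wr, wg, wb = LUMA_WEIGHTS
    # Skip pixels with zero blue brightness to avoid division by zero.
    rbr = [r / b for r, g, b in pixels if b > 0]
    brd = [b - r for r, g, b in pixels]
    lum = [wr * r + wg * g + wb * b for r, g, b in pixels]

    features = {}
    for name, values in (("rbr", rbr), ("brd", brd), ("lum", lum)):
        if values:
            features[name + "_mean"] = fmean(values)
            features[name + "_std"] = pstdev(values)
        else:
            features[name + "_mean"] = float("nan")
            features[name + "_std"] = float("nan")
    return features


if __name__ == "__main__":
    # Two made-up pixels: a bluish one (clear-sky-like) and a gray one
    # (cloud-like).
    print(color_features([(100, 150, 200), (200, 200, 200)]))
```

In a full pipeline these values would be combined with the time and solar zenith angle features before being passed to a regression model; that step is omitted here.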

Ground-based imager

Cloud cover calculation and validation

Machine learning input data

Machine learning methods

Training and validation results of machine learning methods

Test set results for SVR model

Conclusions