Untangling the contribution of input parameters to an artificial intelligence PM2.5 forecast model using the layer-wise relevance propagation method

Dasol Kim, Chang-Hoi Ho, Ingyu Park, Jinwon Kim, Lim-Seok Chang, Min-Hyeok Choi

doi:10.1016/j.atmosenv.2022.119034

Abstract

A recurrent neural network (RNN), an artificial intelligence algorithm, applied to predictions from the Community Multiscale Air Quality operational model has significantly improved the forecast accuracy for concentrations of particulate matter with a diameter of ≤2.5 μm (PM2.5) in the Seoul metropolitan area of the Republic of Korea. However, because the decision-making process of the RNN model is inaccessible, it is challenging to interpret the prediction results and identify the related error sources. This study evaluated the relevance scores of the RNN input variables using layer-wise relevance propagation (LRP) for 6-hourly forecasts over the winters (December through February) of 2015–2021. The relevance score magnitudes summed over the period from the target prediction time to 2–5 and 4–7 time steps before it (i.e., the most recent 12–30 h and 24–42 h, respectively) account for ∼80% of the total relevance score for one- and two-day forecasts, respectively. The input variables were originally selected by their correlation coefficients with the observed PM2.5 concentration; however, the order of input variable contributions measured by the LRP differs from the order of the correlation coefficients, implying an inconsistency between the linear and nonlinear methods. Retraining the RNN model using a subset of variables with high relevance scores yields prediction skill comparable to that obtained with the original set of input variables. By decoding the black box of an artificial intelligence model with the LRP method, this study can contribute to the improvement of the RNN prediction model.
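As a rough illustration of the two quantities discussed in the abstract, the following Python sketch propagates relevance back through a single dense layer and then computes the share of total relevance magnitude that falls in a window of time steps. The abstract does not state which LRP rule or network layout was used, so this is only a sketch under stated assumptions: it uses the epsilon rule, a zero bias, and toy numbers. The function names (lrp_epsilon_dense, window_share), the array layout (row k holds the relevance at k time steps before the target time), and the reading of the window as lags 2 to 5 are all hypothetical choices, not the authors' implementation.

```python
import numpy as np


def lrp_epsilon_dense(a, W, b, R_out, eps=1e-6):
    """Redistribute output relevance R_out onto the inputs of z = a @ W + b.

    Assumed rule (LRP-epsilon): R_j = a_j * sum_k W[j, k] * R_k / (z_k + eps * sign(z_k)).
    """
    z = a @ W + b                                    # pre-activations, shape (n_out,)
    z_stab = z + eps * np.where(z >= 0, 1.0, -1.0)   # keep the denominator away from zero
    s = R_out / z_stab                               # relevance per unit of pre-activation
    return a * (W @ s)                               # input relevance, shape (n_in,)


def window_share(R, start, stop):
    """Fraction of total |relevance| in lags start..stop (inclusive).

    R has shape (n_lags, n_variables); row k is assumed to be k steps before the target.
    """
    mag = np.abs(R).sum(axis=1)                      # magnitude per time step
    return mag[start:stop + 1].sum() / mag.sum()


# Toy dense layer with zero bias (all values are made up for illustration).
a = np.array([1.0, 2.0, 3.0])
W = np.array([[0.5, -1.0],
              [0.25, 1.0],
              [1.0, 0.0]])
b = np.zeros(2)
R_out = np.array([1.0, 0.5])

R_in = lrp_epsilon_dense(a, W, b, R_out)
print("input relevance:", R_in)
print("relevance in, out:", R_in.sum(), R_out.sum())

# Toy relevance map: 8 lags x 3 variables with uniform relevance.
R_map = np.ones((8, 3))
print("share in lags 2-5:", window_share(R_map, 2, 5))
```

With a zero bias and a small eps, the summed input relevance closely matches the summed output relevance, which is the conservation property that makes shares such as the ∼80% figure meaningful as fractions of a fixed total.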

Full Text