Abstract

Accurate air quality index (AQI) forecasting matters for public health, local economic development, and the ecological environment. The AQI is a typical geographical variable, yet its spatial autocorrelation (SAC) is often ignored, which can violate the assumptions of models, including many machine learning methods, that require observations to be independent and identically distributed. To account for the strong SAC of the AQI, this study proposes a novel statistical learning framework for AQI prediction that integrates SAC variables, feature selection, and support vector regression (SVR), with correlation analysis and time series analysis used to extract spatial-temporal features. In addition, the historical AQI series of the target site is adjusted by trigonometric regression to remove non-stationarity. To further improve prediction accuracy, a feature selection method that combines reinforcement learning with a heuristic algorithm is adopted. To demonstrate the effectiveness of the proposed framework, we select AQI data from 34 cities in the Yangtze River Delta, one of the most polluted areas in eastern China, and focus on its three largest cities: Nanjing, Hangzhou, and Shanghai. We compare the proposed framework with several baselines, and the experiments show that its forecasting accuracy is significantly better than that of the baselines at all selected key sites, indicating that it can provide accurate air quality predictions.
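
The trigonometric-regression adjustment mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fit_trig_regression`, the 365-day period, the single harmonic, and the synthetic series are all assumptions made for illustration. The sketch fits an intercept plus sine and cosine terms by ordinary least squares and returns the residual, which is the seasonally adjusted series that a downstream model such as SVR could then use.

```python
import numpy as np


def fit_trig_regression(y, period, n_harmonics=1):
    """Fit intercept + sine/cosine harmonics by least squares.

    Hypothetical helper: returns (fitted seasonal component, residual series).
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    columns = [np.ones_like(t)]  # intercept
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period  # angular frequency of the k-th harmonic
        columns.append(np.cos(w * t))
        columns.append(np.sin(w * t))
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    return fitted, y - fitted


# Synthetic daily AQI-like series (assumed data, two years with a yearly cycle).
rng = np.random.default_rng(0)
t = np.arange(730)
aqi = 80.0 + 25.0 * np.cos(2.0 * np.pi * t / 365.0) + rng.normal(0.0, 5.0, size=t.size)

fitted, residual = fit_trig_regression(aqi, period=365.0)
print("std before adjustment:", aqi.std())
print("std after adjustment: ", residual.std())
```

Because the design matrix includes an intercept, the residual has mean zero, and most of the seasonal variance is absorbed by the fitted harmonics. The appropriate period and number of harmonics for real AQI data would have to be chosen from the data.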
