Abstract

With the rapid development of industrial sectors, air pollution has become a frequent problem worldwide. Particulate matter (PM10), the dominant pollutant in Malaysia, can have highly detrimental effects on human health. This study aims to predict the daily average PM10 concentration using data collected from 60 air quality monitoring stations in Malaysia. Building a separate forecasting model for each station is time-consuming and impractical; therefore, a hybrid model that combines the k-means clustering technique with the long short-term memory (LSTM) model is proposed to reduce both the number of models and the overall training time. Using the training set, the stations were clustered with the k-means algorithm, and one LSTM model was built for each cluster. The prediction performance of the hybrid model was then compared with that of univariate LSTM models built independently for each station. The results show that the hybrid model performs comparably to the univariate LSTM models: for 43 stations, the relative percentage difference (RPD) between the two is less than or equal to 50% on at least two accuracy metrics. The hybrid model also fits the trend of the actual data well while requiring a much shorter training time. Hence, the hybrid model is a competitive and practical option for air quality forecasting in real applications.
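The clustering step and the RPD comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: each station is summarised by two hypothetical training-set features (mean and standard deviation of PM10), the station names and all numeric values are toy data, the metric names (RMSE, MAE, MAPE) are placeholders since the abstract does not name the accuracy metrics, and RPD is taken as the absolute difference divided by the mean of the two values. The LSTM training itself is omitted; in the hybrid approach, one LSTM would be trained per cluster rather than per station.

```python
import random
from statistics import mean

def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(pt, centroids[c]))
            for pt in points]

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means; initial centroids are k randomly chosen points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(tuple(mean(dim) for dim in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids

def rpd(a, b):
    """Relative percentage difference (assumed definition, see lead-in)."""
    return abs(a - b) / ((abs(a) + abs(b)) / 2) * 100

def comparable(hybrid, univariate, threshold=50.0, min_metrics=2):
    """True if RPD <= threshold on at least min_metrics shared metrics."""
    hits = sum(rpd(hybrid[m], univariate[m]) <= threshold for m in hybrid)
    return hits >= min_metrics

# Hypothetical stations: (mean PM10, std PM10) computed on the training set.
stations = {
    "S1": (30.0, 5.0), "S2": (32.0, 6.0), "S3": (70.0, 15.0),
    "S4": (68.0, 14.0), "S5": (31.0, 5.5),
}
labels, centroids = kmeans(list(stations.values()), k=2)
cluster_of = dict(zip(stations, labels))

# Hypothetical error metrics for one station under each modelling approach.
hybrid_metrics = {"RMSE": 12.0, "MAE": 9.0, "MAPE": 30.0}
univariate_metrics = {"RMSE": 10.0, "MAE": 8.0, "MAPE": 15.0}

print(cluster_of)
print(comparable(hybrid_metrics, univariate_metrics))
```

With this grouping, only one model per cluster needs to be trained instead of one per station, which is where the reduction in training time comes from.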
