Abstract

Global air pollution is steadily increasing and harms human health, contributing to respiratory and cardiovascular diseases and cancers. Pollution in Hanoi has recently worsened, with PM2.5 concentrations persistently at a high level. PM2.5 prediction is therefore urgently needed so that early forecasts can be issued. Using air data collected in Hanoi, including meteorological indicators and air pollution indicators, we propose a new feature extraction method that gives better results than older methods when the same algorithm is used. The XGBoost algorithm was applied to predict PM2.5 concentration, and testing showed that its accuracy is higher than that of other data mining algorithms while its training time is significantly lower.

Highlights

  • Increasing air pollution is raising many problems concerning human health

  • Our extraction method gives results about 2% higher than the old method when both are tested on the same model

  • We compared the following predictive models: Support Vector Machine (SVM), Random Forest (RF), Multi-layer Perceptron (MLP) and XGBoost


Summary

INTRODUCTION

Increasing air pollution is raising many problems concerning human health. According to the World Health Organization, air pollution affects everybody in all countries. A key factor in this situation is the surge in PM2.5 concentration in the air. Because this type of dust has a negative impact on human health, predicting the PM2.5 pollution level is increasingly necessary. Meteorological indicators are necessary for the prediction, in addition to other pollution indicators (PM10, particulate matter with a diameter of up to 10 μm; CO2 concentration; TVOC, total volatile organic compounds), and the time factor is also considered to influence the predictions. In part 4, we compare our extraction method with the old extraction method and test different prediction models: SVM, RF, MLP (Multi-layer Perceptron) and XGBoost (Extreme Gradient Boosting). We draw conclusions and discuss future development in part 5.
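To make the kinds of inputs named above concrete, the following is a minimal sketch of how one hourly observation might be flattened into a feature record that combines meteorological indicators, pollution indicators and time factors. It is not the extraction method proposed in this paper, which is described in the Method section. The field names (`timestamp`, `temperature`, `humidity`, `pm10`, `co2`, `tvoc`), the choice of temperature and humidity as meteorological indicators, and the sample values are all assumptions made for illustration.

```python
from datetime import datetime


def build_features(record):
    """Flatten one observation into a feature dict.

    All field names here are hypothetical; the dataset's real schema
    is not given in this summary.
    """
    ts = datetime.fromisoformat(record["timestamp"])
    return {
        # Meteorological indicators (assumed: temperature, humidity)
        "temperature": record["temperature"],
        "humidity": record["humidity"],
        # Other pollution indicators named in the introduction
        "pm10": record["pm10"],
        "co2": record["co2"],
        "tvoc": record["tvoc"],
        # Time factors derived from the timestamp
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday = 0
        "month": ts.month,
    }


# Made-up sample values, for illustration only
sample = {
    "timestamp": "2019-01-07T08:00:00",
    "temperature": 18.5,
    "humidity": 82.0,
    "pm10": 95.0,
    "co2": 420.0,
    "tvoc": 0.3,
}
features = build_features(sample)
```

A record of this shape could then be passed, together with the measured PM2.5 value as the target, to any of the models compared in part 4.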

RELATED WORKS
METHOD
Data description
Feature extraction method
Predictive model description
EXPERIMENTS AND RESULTS
CONCLUSION AND DEVELOPMENT TREND
