Abstract

Air pollution is a growing environmental concern, especially in large cities, and it harms both living beings and the environment. Air quality can be predicted with probabilistic and statistical techniques, but these methods are complex to apply. Machine learning offers a better approach. Air quality forecasting is a crucial step in protecting public health because it provides early warning of harmful air pollutants, and it helps in initiating emergency measures to reduce the discharge of pollutants and mitigate the consequences. Analysis showed that an imbalanced class distribution leads to inaccurate predictions. To address this, two well-known resampling techniques are used: Synthetic Minority Oversampling Technique (SMOTE) for oversampling and Neighbourhood Cleaning Rule (NCR) for undersampling. The effects of these techniques and of their combination (SMOTE+NCR) on two prominent machine learning classifiers, K-Nearest Neighbours (KNN) and Naive Bayes, are compared. The results show that KNN performed better on data resampled with SMOTE+NCR, whereas Naive Bayes performed better on data undersampled with NCR.
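As an illustration of the oversampling idea mentioned above, the following is a minimal standard-library sketch of the interpolation step at the core of SMOTE. It is not the authors' implementation: the function name `smote`, its parameters, and the toy readings are hypothetical, and the NCR undersampling step and the KNN and Naive Bayes classifiers are not shown.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and its neighbour.
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: two pollutant readings per minority-class sample.
minority = [(150.0, 80.0), (160.0, 90.0), (155.0, 85.0), (170.0, 95.0)]
majority_count = 10  # assumed size of the majority class
new_points = smote(minority, n_new=majority_count - len(minority), k=3)
balanced_minority = minority + new_points
```

Because every synthetic point lies on a segment between two existing minority samples, the minority class grows without duplicating rows exactly. An undersampling rule such as NCR would then be applied to clean noisy majority samples before training a classifier.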
