Background: The Air Pollutant Index (API) in Malaysia is determined by calculating sub-indices for six main pollutants, namely particulate matter (PM10 and PM2.5), ozone (O3), carbon monoxide (CO), sulphur dioxide (SO2), and nitrogen dioxide (NO2), based on their potential health implications for the public. This study focuses on the UiTM Shah Alam Internet of Things (IoT) monitoring station in Selangor, an urban area. Method: Data recorded from February 2022 to June 2022 were retrieved from low-cost IoT sensors. The study aims to develop a machine-learning model that predicts air pollutant concentrations for the following day. Results: A comparison of three models shows that Random Forest gave the best predictions of PM10 concentration, with root-mean-square error (RMSE) values between 10.88 and 18.15, absolute error values between 8.03 and 11.39, and relative error values between 29.67% and 31.51%. For SO2, the RMSE, absolute error, and relative error were 0.26–0.39, 0.11–0.26, and 50.11%–84.64%, respectively. For NO2, they were 0.004–0.005, 0.003–0.004, and 20.83%–24.52%. For CO, the RMSE (0.259–0.468), absolute error (0.147–0.250), and relative error (26.01%–42.34%) were all within allowable bounds. For O3, the RMSE, absolute error, and relative error were 0.003–0.005, 0.0005–0.00006, and 26.17%–33.10%, respectively. For PM2.5, the RMSE was 16.65–26.83, the absolute error 10.15–14.29, and the relative error 31.43%–33.09%. Conclusion: Based on these results, the study shows that PM2.5 is a significant pollutant representing the API.
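The three error measures reported above can be illustrated with a minimal sketch. The abstract does not define them, so the formulas here are assumptions: absolute error is taken as the mean absolute difference, and relative error as the mean absolute difference divided by the observed value, expressed as a percentage. The function name `error_metrics` and the sample values are hypothetical and are not data from the study.

```python
import math

def error_metrics(actual, predicted):
    """Return (RMSE, mean absolute error, mean relative error in percent).

    Assumed definitions; the study may compute these differently.
    Observed values must be non-zero for the relative error to be defined.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    # Root-mean-square error: square root of the mean squared difference
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Absolute error: mean of the absolute differences
    mae = sum(abs(e) for e in errors) / n
    # Relative error: mean of |difference| / |observed|, as a percentage
    rel = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return rmse, mae, rel

# Made-up observed and predicted concentrations, for illustration only
actual = [40.0, 50.0, 60.0]
predicted = [44.0, 45.0, 66.0]
rmse, mae, rel = error_metrics(actual, predicted)
print(f"RMSE={rmse:.2f}, absolute error={mae:.2f}, relative error={rel:.2f}%")
```

RMSE and absolute error carry the units of the pollutant concentration, whereas relative error is unitless. This may be why the SO2 figures show small absolute errors alongside large relative errors.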