Improving Air Quality Data Reliability through Bi-Directional Univariate Imputation with the Random Forest Algorithm

Filip Arnaut, Vladimir A Srećković, Sreten Jevremović, Aleksandra Kolarski, Vladimir Đurđević

doi:10.3390/su16177629

Abstract

Forecasts of future air pollution levels provide valuable information for the general public, vulnerable populations, and policymakers. High-quality data are essential for precise and reliable forecasts and investigations of air pollution. Missing observations arise when the sensors used to measure air quality parameters malfunction, producing erroneous measurements or gaps in the dataset and degrading data quality. This research paper presents a novel univariate approach for imputing missing values in air quality data. The approach employs the random forest (RF) algorithm to impute missing observations bi-directionally (forward and reverse in time) in air quality data (particulate matter less than 2.5 μm, PM2.5) from the Republic of Serbia. It was evaluated against simple methods, such as mean and median imputation, for gaps of 24, 48, and 72 h. The results indicate that our algorithm yielded error rates comparable to those of the median imputation method for all gap durations when imputing the PM2.5 data. Ultimately, the algorithm's higher computational complexity was not justified by the minimal error decrease it achieved over the simpler methods. Further research is needed to improve the approach, for example by utilizing low-code machine learning libraries and time-series forecasting techniques.
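The abstract does not give implementation details, so the following is only a minimal sketch of what a bi-directional univariate RF imputation could look like, not the authors' code. It assumes scikit-learn's RandomForestRegressor, 24 lagged values as features, recursive multi-step prediction, and a plain average of the forward and reverse predictions; all function names, the lag count, and the synthetic hourly series are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def make_lagged(series, n_lags):
    """Build (X, y): each row of X holds the n_lags values preceding y."""
    X = np.array([series[i - n_lags:i] for i in range(n_lags, len(series))])
    y = np.array(series[n_lags:])
    return X, y


def recursive_forecast(history, steps, n_lags, seed=0):
    """Fit an RF on lagged values of `history`, then predict `steps` values
    ahead, feeding each prediction back in as the newest lag."""
    X, y = make_lagged(history, n_lags)
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X, y)
    window = list(history[-n_lags:])
    preds = []
    for _ in range(steps):
        nxt = float(model.predict(np.array([window]))[0])
        preds.append(nxt)
        window = window[1:] + [nxt]
    return np.array(preds)


def bidirectional_impute(series, gap_start, gap_len, n_lags=24):
    """Fill one gap by averaging a forward-in-time and a reverse-in-time
    forecast (averaging is an assumption, not taken from the paper)."""
    series = np.asarray(series, dtype=float)
    before = series[:gap_start]
    after = series[gap_start + gap_len:]
    forward = recursive_forecast(before, gap_len, n_lags)
    # Reverse the data after the gap, forecast in reversed time, flip back.
    backward = recursive_forecast(after[::-1], gap_len, n_lags)[::-1]
    filled = series.copy()
    filled[gap_start:gap_start + gap_len] = (forward + backward) / 2.0
    return filled


def rmse(a, b):
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))


# Synthetic hourly "PM2.5-like" series with a daily cycle (not real data).
rng = np.random.default_rng(0)
t = np.arange(24 * 20)
truth = 30 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, t.size)

gap_start, gap_len = 24 * 10, 24  # one 24 h gap in the middle
observed = truth.copy()
observed[gap_start:gap_start + gap_len] = np.nan

filled = bidirectional_impute(observed, gap_start, gap_len)

gap = slice(gap_start, gap_start + gap_len)
mean_fill = np.full(gap_len, np.nanmean(observed))
median_fill = np.full(gap_len, np.nanmedian(observed))

print("RF bi-directional RMSE:", round(rmse(filled[gap], truth[gap]), 3))
print("Mean imputation RMSE:  ", round(rmse(mean_fill, truth[gap]), 3))
print("Median imputation RMSE:", round(rmse(median_fill, truth[gap]), 3))
```

Changing `gap_len` to 48 or 72 mirrors the gap durations named in the abstract. On a clean synthetic cycle like this one the comparison says nothing about real PM2.5 data, where the abstract reports error rates comparable to median imputation.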

Journal: Sustainability
Publication Date: Sep 3, 2024
License type: CC BY 4.0
