Effect of Feature Scaling Pre-processing Techniques on Machine Learning Algorithms to Predict Particulate Matter Concentration for Gandhinagar, Gujarat, India

Zalak L Thakker, Sanjay H Buch

doi:10.32628/ijsrst52411150

Abstract

Particulate matter (PM) is widely recognized as the primary factor responsible for air pollution and poses significant health hazards, particularly cardiovascular and respiratory diseases. Major sources of particulate matter include construction sites, power plants, industries and automobiles, landfills and agriculture, wildfires and brush/waste burning, wind-blown dust from open lands, pollen, and fragments of bacteria. Although various studies have been carried out to predict particulate matter concentration, only a handful focus on data scaling as a pre-processing step and on how it affects the prediction. For this study, Gandhinagar Smart City Development Limited, Gandhinagar, Gujarat, provided air quality data from 26-01-2022 to 16-01-2023. The data present several challenges, including missing values, inconsistent values, and mixed types (numerical and categorical). Data pre-processing is an essential step in machine learning regression problems; its techniques include missing value handling, data scaling, outlier detection, feature selection/engineering, and imputation. This paper therefore aims to identify the effect of data scaling on the prediction of Particulate Matter (PM10) concentration for Gandhinagar, Gujarat. The scaling technique is chosen based on whether or not the data are normally distributed. Four data scaling techniques (Normalizer, Robust Scaler, Min-Max Scaler, and Standard Scaler) were compared in combination with six machine learning algorithms (Multiple Linear Regressor, Support Vector Regressor, K-Nearest Neighbour Regressor, Decision Tree Regressor, Random Forest Regressor, and XGBoost Regressor) to identify the best prediction model for PM10 concentration.
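
The four scaling techniques named in the abstract can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names and the PM10 readings below are hypothetical, chosen only to show how each transform treats an outlier. The formulas follow the usual definitions of these scalers (population standard deviation for standard scaling, median and interquartile range for robust scaling). Note that the Normalizer differs from the other three in that it rescales each sample (row) to unit length rather than rescaling each feature (column).

```python
import math
import statistics


def min_max_scale(column):
    """Map a feature column linearly onto [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in column]


def standard_scale(column):
    """Centre a feature column on its mean and divide by its population std."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(x - mean) / std if std else 0.0 for x in column]


def robust_scale(column):
    """Centre a feature column on its median and divide by its interquartile range."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = q3 - q1
    return [(x - median) / iqr if iqr else 0.0 for x in column]


def normalize_row(row):
    """Rescale one sample (a row of features) to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in row))
    return [x / norm if norm else 0.0 for x in row]


if __name__ == "__main__":
    # Hypothetical PM10 readings with one extreme value; not taken from the study's data.
    pm10 = [40.0, 55.0, 60.0, 75.0, 300.0]
    print("min-max :", min_max_scale(pm10))
    print("standard:", standard_scale(pm10))
    print("robust  :", robust_scale(pm10))
    # Hypothetical two-feature sample, scaled as a row rather than as a column.
    print("row norm:", normalize_row([3.0, 4.0]))
```

With these illustrative values, min-max scaling sends the extreme reading to 1.0 and squeezes the remaining four below roughly 0.14, whereas robust scaling keeps those four between -1 and 0.75 and leaves the extreme reading far outside that range. Differences of this kind are one reason the choice of scaler can matter for distance-based models such as K-Nearest Neighbour and Support Vector regressors.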


Journal: International Journal of Scientific Research in Science and Technology
Publication Date: Feb 1, 2024
License type: cc-by

