Abstract

Before building machine learning models, the dataset should be prepared so that it is of high quality and gives the model the best possible representation of the data. Different attributes may have different scales, which can increase the difficulty of the problem being modeled. A model trained on attributes with widely varying scales may suffer from poor performance during learning. Our study explores the use of numerical data scaling as a data pre-processing step, with the aim of assessing how effectively these methods can improve the accuracy of learning algorithms. In particular, we compared three numerical data scaling methods in combination with four machine learning classifiers for predicting disease severity. The experiments were built on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) datasets that included 1206 patients who were admitted between June 2020 and April 2021. The diagnosis of all cases was confirmed with RT-PCR. Basic demographic data and medical characteristics of all participants were collected. The reported results indicate that all techniques perform well with numerical data scaling and that there is a significant improvement in the models on unseen data. We conclude that classifier performance increases when scaling techniques are used: these methods help the algorithms learn the patterns in the dataset, which in turn helps produce accurate models.
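As a rough illustration of what numerical data scaling does, the sketch below implements two widely used methods, min-max scaling and z-score standardization, in plain Python. The abstract does not name the three methods that were compared, so these two are assumptions chosen for illustration only. The function names (`fit_min_max`, `fit_z_score`) and the attribute values are hypothetical and are not taken from the study's data.

```python
from statistics import mean, pstdev


def fit_min_max(column):
    """Learn min and max from training values; return a function mapping x to [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo

    def transform(x):
        # A constant column has zero span; map every value to 0.0 in that case.
        return 0.0 if span == 0 else (x - lo) / span

    return transform


def fit_z_score(column):
    """Learn mean and population standard deviation; return a standardizing function."""
    mu, sigma = mean(column), pstdev(column)

    def transform(x):
        # A constant column has zero deviation; map every value to 0.0 in that case.
        return 0.0 if sigma == 0 else (x - mu) / sigma

    return transform


# Hypothetical attributes on very different scales (illustrative values only).
train_age = [20, 40, 60, 80]
train_marker = [100.0, 300.0, 500.0, 900.0]

# Fit the scalers on training data only, then reuse them for unseen data.
scale_age = fit_min_max(train_age)
scale_marker = fit_z_score(train_marker)

scaled_age = [scale_age(x) for x in train_age]
scaled_marker = [scale_marker(x) for x in train_marker]
unseen_age = scale_age(50)
```

Fitting the scaling parameters on the training values and reusing them for new records keeps the evaluation on unseen data free of information from that data, which is the usual practice when measuring model performance.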
