Abstract
Research on the COVID-19 pandemic is important for delivering detailed risk analyses based on estimates of the pandemic's peak. Machine learning has a major role to play in predicting the number of COVID-19 cases, and most research on COVID-19 uses polynomial regression for this analysis. A regression model, once built, often fails to generalize to unseen data. For instance, the model may become too complex and exhibit high variance due to over-fitting, which degrades its performance on new data sets. To avoid over-fitting in polynomial regression, a regularization method can be used to suppress the coefficients of the higher-order polynomial terms, which keeps the regression function smooth. The aim of this paper is to formulate a mathematical model for the regularization coefficient in polynomial regression and to evaluate this approach on a COVID-19 data set. We believe that our results will contribute to a better understanding of the over-fitting process in polynomial regression. Our methodology consists of the following major steps: i) optimizing the model using k-fold cross-validation to find an optimal regularization coefficient and ii) comparing the performance of ridge regression and lasso regression using accuracy metrics. Our approach could also have an impact on machine learning education by clarifying the mathematical machinery behind polynomial regression algorithms. The obtained results show that the polynomial model built using lasso regression outperforms the one built using ridge regression.
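The two methodological steps can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (not COVID-19 counts), all function names (`poly_features`, `fit_ridge`, `fit_lasso`, `cv_mse`, `best_lambda`) are hypothetical, the lasso solver is a plain coordinate-descent routine chosen for brevity, and mean squared error stands in for the unspecified accuracy metrics. Only NumPy is assumed.

```python
import numpy as np

def poly_features(x, degree):
    """Columns x^1 .. x^degree (the intercept is handled separately)."""
    return np.vander(x, degree + 1, increasing=True)[:, 1:]

def standardize(X_train, X_test):
    """Scale both sets with the training mean and standard deviation."""
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
    sd[sd == 0] = 1.0
    return (X_train - mu) / sd, (X_test - mu) / sd

def fit_ridge(X, y, lam):
    """Minimize ||y - b - Xw||^2 + lam * ||w||^2 (closed form, X centered)."""
    b = y.mean()
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ (y - b))
    return w, b

def fit_lasso(X, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - b - Xw||^2 + lam * ||w||_1 by coordinate descent."""
    b = y.mean()
    w = np.zeros(X.shape[1])
    r = y - b                       # residual for the current w
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if col_sq[j] == 0:
                continue
            r += X[:, j] * w[j]     # remove feature j's contribution
            rho = X[:, j] @ r
            # Soft-thresholding update for coordinate j.
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * w[j]     # add the updated contribution back
    return w, b

def cv_mse(x, y, degree, lam, fit, k=5, seed=0):
    """Mean validation MSE over k shuffled folds."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(x)), k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        Xtr, Xte = standardize(poly_features(x[train], degree),
                               poly_features(x[test], degree))
        w, b = fit(Xtr, y[train], lam)
        errs.append(np.mean((Xte @ w + b - y[test]) ** 2))
    return float(np.mean(errs))

def best_lambda(x, y, degree, lambdas, fit, k=5):
    """Step i): pick the regularization coefficient with the lowest CV error."""
    scores = [cv_mse(x, y, degree, lam, fit, k) for lam in lambdas]
    i = int(np.argmin(scores))
    return lambdas[i], scores[i]

if __name__ == "__main__":
    # Synthetic quadratic trend with noise, fitted by a degree-9 polynomial.
    rng = np.random.default_rng(42)
    x = np.linspace(-1, 1, 60)
    y = 1 + 2 * x - 3 * x ** 2 + rng.normal(0, 0.2, x.size)
    lambdas = [1e-3, 1e-2, 1e-1, 1.0, 10.0]
    # Step ii): compare the two penalties at their own best coefficient.
    for name, fit in [("ridge", fit_ridge), ("lasso", fit_lasso)]:
        lam, mse = best_lambda(x, y, 9, lambdas, fit)
        print(f"{name}: best lambda={lam}, CV MSE={mse:.4f}")
```

The two objectives are scaled differently, so the same numerical value of the coefficient does not mean the same strength of penalty for ridge and lasso. Each method is therefore tuned separately before the cross-validated errors are compared.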