Abstract

This study evaluates the performance of the Ridge, Elastic Net, and Lasso regression methods in handling different degrees of multicollinearity among the independent variables in multiple regression analysis, using simulated data. The researcher simulated data sets with sample sizes n = 200, 1000, 10000, 50000, and 100000 and p = 10 independent variables, and compared the three methods using mean squared error (MSE). The study found that the Elastic Net method outperforms the Ridge and Lasso methods in estimating the regression coefficients when the degree of multicollinearity is low, moderate, or high, for any sample size. The Lasso method, however, is the most accurate estimator of the regression coefficients when the data contain severe multicollinearity and the sample size is below 10000 observations.
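The comparison described above can be illustrated with a minimal sketch, which is not the study's actual procedure. It assumes equicorrelated predictors with pairwise correlation `rho`, true coefficients all equal to 1, a fixed penalty `lam` rather than a tuned one, and MSE measured on the coefficient estimates. The function names (`simulate`, `elastic_net`, `compare`) are hypothetical. A single coordinate-descent routine covers all three methods through a mixing weight `l1_ratio` (0 for Ridge, 1 for Lasso, values in between for Elastic Net).

```python
import math
import random


def simulate(n, p, rho, beta, rng):
    """Draw n rows of p predictors with pairwise correlation rho (assumed design),
    and responses y = X beta + standard normal noise."""
    X, y = [], []
    for _ in range(n):
        shared = rng.gauss(0, 1)  # common factor that induces the correlation
        row = [math.sqrt(rho) * shared + math.sqrt(1 - rho) * rng.gauss(0, 1)
               for _ in range(p)]
        X.append(row)
        y.append(sum(b * x for b, x in zip(beta, row)) + rng.gauss(0, 1))
    return X, y


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam, l1_ratio, n_iter=200, tol=1e-8):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (l1_ratio*||b||_1 + (1 - l1_ratio)/2 * ||b||_2^2),
    without an intercept (the simulated predictors have mean zero)."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    col_sq = [sum(v * v for v in col) / n for col in cols]
    b = [0.0] * p
    resid = list(y)  # residuals y - Xb for the current b
    for _ in range(n_iter):
        max_step = 0.0
        for j in range(p):
            denom = col_sq[j] + lam * (1 - l1_ratio)
            if denom == 0.0:
                continue  # all-zero column with a pure L1 penalty: leave b[j] at 0
            col = cols[j]
            # Correlation of column j with the partial residual that excludes b[j]
            rho_j = sum(c * r for c, r in zip(col, resid)) / n + col_sq[j] * b[j]
            new = soft_threshold(rho_j, lam * l1_ratio) / denom
            delta = new - b[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                b[j] = new
                max_step = max(max_step, abs(delta))
        if max_step < tol:
            break
    return b


def coef_mse(est, true):
    """Mean squared error between estimated and true coefficients."""
    return sum((e - t) ** 2 for e, t in zip(est, true)) / len(true)


def compare(n=200, p=10, rho=0.9, lam=0.1, reps=5, seed=0):
    """Average coefficient MSE of the three methods over reps simulated data sets."""
    rng = random.Random(seed)
    beta = [1.0] * p  # assumed true coefficients
    methods = {"ridge": 0.0, "elastic_net": 0.5, "lasso": 1.0}
    totals = {name: 0.0 for name in methods}
    for _ in range(reps):
        X, y = simulate(n, p, rho, beta, rng)
        for name, ratio in methods.items():
            totals[name] += coef_mse(elastic_net(X, y, lam, ratio), beta)
    return {name: total / reps for name, total in totals.items()}


if __name__ == "__main__":
    for rho in (0.3, 0.6, 0.9):
        print(rho, compare(rho=rho))
```

Because the penalty is fixed here, the ranking this sketch produces need not match the study's findings; selecting `lam` and `l1_ratio` by cross-validation would be the natural next step.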

Highlights

  • Multiple linear regression is frequently employed, in appropriate contexts, to fit a model that predicts expected responses or to explore the relationship between the dependent variable and the independent variables

  • Regression analysis rests on many assumptions about the model; the most important one concerns multicollinearity, in addition to those concerning non-homogeneity of variance, autocorrelation, and linearity

  • In the model $Y = X\beta + \varepsilon$, $Y_{n \times 1}$ is the vector of the dependent variable, $X_{n \times p}$ holds the independent variables, $\beta_{p \times 1}$ is the vector of regression coefficients to be estimated, and $\varepsilon_{n \times 1}$ is the vector of residuals

Summary

Introduction

Multiple linear regression is frequently employed, in appropriate contexts, to fit a model that predicts expected responses or to explore the relationship between the dependent variable and the independent variables. The first goal, the model's prediction accuracy, is critical; the second, the model's complexity, is more important. Ordinary linear regression procedures are known for often performing poorly with respect to both prediction performance and model complexity (Doreswamy and Vastrad, 2013). Regression analysis rests on many assumptions about the model; the most important one concerns multicollinearity, in addition to those concerning non-homogeneity of variance, autocorrelation, and linearity. If one or more assumptions are violated, the model becomes unreliable and is no longer suitable for estimating population parameters (Herawati et al., 2018).
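
To make concrete why multicollinearity matters, the ordinary least squares estimator and the three penalized estimators compared in this study can be written in their textbook forms, with Y the response vector, X the matrix of independent variables, and β the coefficient vector. This is a sketch in one common parameterization; the tuning parameters λ, λ1, and λ2 are notation assumed here, and the source may scale the penalties differently.

```latex
\begin{align*}
\hat{\beta}_{\text{OLS}}   &= (X^{\top}X)^{-1}X^{\top}Y \\
\hat{\beta}_{\text{ridge}} &= \arg\min_{\beta}\; \lVert Y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 \\
\hat{\beta}_{\text{lasso}} &= \arg\min_{\beta}\; \lVert Y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \\
\hat{\beta}_{\text{enet}}  &= \arg\min_{\beta}\; \lVert Y - X\beta \rVert_2^2 + \lambda_1 \lVert \beta \rVert_1 + \lambda_2 \lVert \beta \rVert_2^2
\end{align*}
```

When the columns of X are nearly linearly dependent, the matrix X'X is close to singular, so the least squares estimates have inflated variance. The penalty terms trade a small bias for a reduction in that variance.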
