Abstract

Selecting the best model from a set of candidates for a given data set is not an easy task. In this paper, we propose a new criterion that, in addition to minimizing the sum of squared errors, imposes a larger penalty for adding too many coefficients (estimated parameters) to the model when the sample is small and the noise is high. We discuss several real applications that illustrate the proposed criterion and compare its results with existing criteria using a simulated data set and several real data sets, including advertising budget data, newly collected heart blood pressure health data, and software failure data.

Highlights

  • Model selection has become an important focus in recent years in statistical learning, machine learning, and big data analytics [1,2,3,4]

  • The mean squared error (MSE), root mean squared error (RMSE), R2, adjusted R2, Akaike’s Information Criterion (AIC), the Bayesian Information Criterion (BIC), and AICc are among the common criteria used to measure model performance and select the best model from a set of potential models

  • In one application, the R2 is 0.8972, so 89.72% of the variability is explained by the fitted model; results are compared across criteria such as MSE, RMSE, AIC, AICc, BIC, and adjusted R2


Summary

Introduction

Model selection has become an important focus in recent years in statistical learning, machine learning, and big data analytics [1,2,3,4]. The mean squared error (MSE), root mean squared error (RMSE), R2, adjusted R2, Akaike’s Information Criterion (AIC), the Bayesian Information Criterion (BIC), and AICc are among the common criteria used to measure model performance and select the best model from a set of potential models. We propose a new criterion, PIC, for selecting the best model among a set of candidate models. The proposed PIC imposes a larger penalty when too many coefficients must be estimated from too small a sample in the presence of too much noise. We also briefly review several common existing criteria, including AIC, BIC, AICc, R2, adjusted R2, MSE, and RMSE.
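To make the existing criteria concrete, the sketch below fits two nested linear models by ordinary least squares and computes MSE, RMSE, AIC, BIC, and AICc for each. This is a minimal illustration of the standard Gaussian-likelihood formulas, not the paper's proposed PIC; the synthetic data and function names are our own assumptions.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns coefficients and RSS."""
    X1 = np.column_stack([np.ones(len(y)), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    rss = float(np.sum((y - X1 @ beta) ** 2))
    return beta, rss

def selection_criteria(rss, n, k):
    """Common model-selection criteria for a Gaussian linear model.

    n = sample size, k = number of estimated coefficients (incl. intercept).
    AIC/BIC use the profiled log-likelihood n*log(RSS/n) up to a constant.
    """
    mse = rss / n
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)        # small-sample correction
    return {"MSE": mse, "RMSE": np.sqrt(mse), "AIC": aic, "BIC": bic, "AICc": aicc}

# Example: compare a 1-predictor and a 3-predictor model on the same data,
# where only the first predictor truly matters.
rng = np.random.default_rng(0)
n = 30
X = rng.normal(size=(n, 3))
y = 2.0 + 1.5 * X[:, 0] + rng.normal(scale=0.5, size=n)

for cols in ([0], [0, 1, 2]):
    _, rss = fit_ols(X[:, cols], y)
    crit = selection_criteria(rss, n, k=len(cols) + 1)
    print(len(cols), "predictors:", {name: round(v, 3) for name, v in crit.items()})
```

Note how the penalty terms differ: BIC's k·log(n) exceeds AIC's 2k once n > e², and AICc adds a further correction that grows as k approaches n, which is the small-sample regime the proposed PIC also targets.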

Some Criteria for Model Comparisons
Numerical Examples
Applications
Conclusions

