Abstract

A main goal of many machine learning studies is to identify the variables that matter most for a given research problem. Various algorithms have been developed for this purpose, and Random Forest, Cubist, and MARS are among the most commonly used. Although classical statistical methods are useful, to some extent, for estimating the importance of the variables that affect the output, machine learning algorithms may provide clearer and more precise results. In this study, the estimation results of Random Forest, Cubist, and MARS are compared on a real data set in terms of performance criteria such as mean squared error, the coefficient of determination, and mean absolute error. The results show that Random Forest and Cubist perform similarly to each other and better than MARS. Additionally, the ranking of the most important variables varies with the algorithm. The concordance between the algorithms is investigated from a statistical perspective and found to be satisfactory. Consequently, Random Forest, Cubist, and MARS can be considered effective and reasonable algorithms for both estimation performance and variable importance evaluation.
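
The three performance criteria named in the abstract can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and are not taken from the study. The coefficient of determination is computed here as 1 - SS_res / SS_tot, which is one common definition; the paper may use a different one.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical observed values and model predictions (illustration only).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 8.0]

mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(f"MSE={mse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

Lower mean squared error and mean absolute error, and a coefficient of determination closer to 1, indicate better estimation performance, so the same three functions could be applied to the predictions of each algorithm to compare them on equal terms.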
