Abstract

In this article, we empirically compare the performance of classification methods while varying the number of classes of the dependent variable, the number of independent variables, the types of independent variables, the number of classes of the independent variables, and the sample size. The study employs 324 simulated examples, with artificial neural networks and decision trees as the data mining techniques and logistic regression as the statistical method. Using misclassification error as the performance metric, we report several additional findings: (i) for continuous independent variables, the statistical technique (logistic regression) was superior to the data mining techniques (artificial neural network and decision tree) when the dependent variable was binary, while the artificial neural network performed best when the dependent variable had three or more classes; (ii) for continuous and categorical independent variables, logistic regression performed better than the artificial neural network and the decision tree when both the number of independent variables and the sample size were small, while the artificial neural network performed best in all other cases; and (iii) the performance of the artificial neural network improved faster than that of the other methods as the number of independent variables and the number of classes of the dependent variable increased.
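
The abstract names misclassification error as the metric but does not define it. As a minimal sketch, assuming the usual definition (the fraction of observations whose predicted class differs from the true class), it could be computed as follows. The function name `misclassification_error` and the sample labels are hypothetical, chosen only for illustration, and are not taken from the study.

```python
from typing import Hashable, Sequence


def misclassification_error(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> float:
    """Return the fraction of predictions that differ from the true labels.

    Assumes the common definition of misclassification error; the study
    itself may use a variant (for example, error on a held-out sample).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one observation is required")
    # Count positions where the predicted class is not the true class.
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


# Hypothetical three-class labels and predictions, for illustration only.
actual = ["a", "b", "c", "a", "b"]
predicted = ["a", "c", "c", "a", "a"]

# Two of the five predictions differ from the true labels.
print(misclassification_error(actual, predicted))
```

Because the metric is a simple proportion, it applies unchanged to binary and multi-class dependent variables, which is what allows a single measure to be compared across all of the simulated settings.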
