Abstract
The goal of this study was to compare the performance of decision tree (DT) classifiers with that of artificial neural network (ANN) and linear discriminant analysis (LDA) classifiers under different class distributions, feature space dimensionalities, and training sample sizes, using a Monte Carlo simulation study. We also investigated a bagging technique for improving the accuracy of the DT. The resubstitution (training) and test accuracies of the classifiers were compared using the area Az under the ROC curve as the performance measure. Three types of feature space distributions were studied: a Gaussian feature space, a mixture of Gaussians, and a mixture of uniform distributions. The feature space dimensionality was varied between 2 and 12. In a given experiment, 1000 cases were randomly sampled from each distribution, Nt training cases per class were used for classifier design, and the remaining cases were used to test the classifier. The effect of the training sample size was investigated by varying Nt between 30 and 500. Performance measures from 100 experiments were averaged. Our results indicated that, in the Gaussian feature space, the LDA outperformed the other two classifiers, especially when the number of training cases was low. For the mixture of uniform distributions, the Az value of the DT was in general higher than that of the ANN and the LDA. For the mixture of Gaussians, the performances of the DT and ANN classifiers were comparable. These results indicate that a DT can be a viable alternative to ANN and LDA classifiers in certain feature spaces.
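The experimental loop described above (sample cases per class, train on Nt of them, test on the rest, average Az over repeated experiments) can be illustrated with a minimal sketch for the LDA in the Gaussian feature space only. Several details are assumptions made for illustration and are not taken from the study: the class means and unit covariance (`shift`), the empirical Mann-Whitney estimate standing in for Az (the study may have used a fitted ROC curve), and all function names.

```python
import random

def auc(neg_scores, pos_scores):
    """Empirical area under the ROC curve (Mann-Whitney statistic).
    Assumption: used here as a stand-in for the Az of the abstract."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def sample_class(rng, mean, n):
    """Draw n cases from a Gaussian with the given mean and identity covariance."""
    return [[rng.gauss(m, 1.0) for m in mean] for _ in range(n)]

def mean_vec(X):
    n, d = len(X), len(X[0])
    return [sum(x[j] for x in X) / n for j in range(d)]

def pooled_cov(X0, X1, m0, m1):
    """Pooled within-class covariance matrix of the two training sets."""
    d = len(m0)
    S = [[0.0] * d for _ in range(d)]
    for X, m in ((X0, m0), (X1, m1)):
        for x in X:
            c = [x[j] - m[j] for j in range(d)]
            for i in range(d):
                for j in range(d):
                    S[i][j] += c[i] * c[j]
    dof = len(X0) + len(X1) - 2
    return [[v / dof for v in row] for row in S]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def run_experiment(rng, dim, n_train, n_cases, shift):
    """One experiment: returns (resubstitution Az, test Az) for the LDA."""
    mean0 = [0.0] * dim
    mean1 = [shift] + [0.0] * (dim - 1)  # hypothetical class separation
    X0 = sample_class(rng, mean0, n_cases)
    X1 = sample_class(rng, mean1, n_cases)
    tr0, te0 = X0[:n_train], X0[n_train:]
    tr1, te1 = X1[:n_train], X1[n_train:]
    m0, m1 = mean_vec(tr0), mean_vec(tr1)
    S = pooled_cov(tr0, tr1, m0, m1)
    # Fisher/LDA weight vector: w = S^{-1} (m1 - m0)
    w = solve(S, [a - b for a, b in zip(m1, m0)])

    def score(x):
        return sum(wi * xi for wi, xi in zip(w, x))

    az_train = auc([score(x) for x in tr0], [score(x) for x in tr1])
    az_test = auc([score(x) for x in te0], [score(x) for x in te1])
    return az_train, az_test

def monte_carlo(dim, n_train, n_experiments, n_cases=1000, shift=1.0, seed=0):
    """Average resubstitution and test Az over independent experiments."""
    rng = random.Random(seed)
    runs = [run_experiment(rng, dim, n_train, n_cases, shift)
            for _ in range(n_experiments)]
    mean_train = sum(r[0] for r in runs) / n_experiments
    mean_test = sum(r[1] for r in runs) / n_experiments
    return mean_train, mean_test
```

Calling `monte_carlo(dim=2, n_train=30, n_experiments=100)` would mirror the smallest configuration named in the abstract, though the pairwise Az estimate makes it slow in pure Python. A DT or ANN could be substituted by replacing the weight-vector step with another scoring function.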