Abstract

Comparing classifier performances may seem a mundane task, yet it remains something of a sideshow in machine learning. The paired t-test is commonly used for this purpose, but it requires that the two classifiers be run simultaneously, or that simultaneous runs be simulated. This is not always possible, and it can entail building a superstructure solely for that purpose. Moreover, the suitability of the t-test in this context has itself been questioned, and the literature on alternatives is rather involved and does not measure up to the scale of the issue. In this paper, the topics connected with accuracy calculation are surveyed once more, with emphasis on the variation of results. The well-known technique of multifold cross-validation is exemplified. A simplified methodology for comparing classifier performances is proposed, based on the mean and variance of accuracy and on computing differences between objects defined in these terms. It is applied to naive Bayesian and decision tree classifiers implemented on different platforms. The lazy learning approach, applicable to decision trees in discrete domains, is followed closely, together with a proposal for how it can be improved. Examples are drawn from the field of health diagnostics.
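For illustration only, the following is a minimal sketch of the kind of comparison the abstract refers to: per-fold accuracies of a naive Bayesian and a decision tree classifier are estimated by multifold cross-validation and then summarized by the mean and variance of accuracy, with the conventional paired t-test shown for contrast. It is not the paper's exact procedure; the dataset (scikit-learn's breast cancer data, chosen as a health-diagnostics stand-in), the fold count, and the classifier settings are all assumptions.

# Sketch: compare two classifiers by the mean and variance of their
# cross-validated accuracies, with a paired t-test for contrast.
# Dataset, fold count, and classifier settings are illustrative assumptions.
import numpy as np
from scipy import stats
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Per-fold accuracies on identical folds for both classifiers.
acc_nb = cross_val_score(GaussianNB(), X, y, cv=cv, scoring="accuracy")
acc_dt = cross_val_score(DecisionTreeClassifier(random_state=0), X, y,
                         cv=cv, scoring="accuracy")

# Summary by accuracy mean and variance, as in the simplified methodology.
for name, acc in (("naive Bayes", acc_nb), ("decision tree", acc_dt)):
    print(f"{name}: mean accuracy = {acc.mean():.4f}, variance = {acc.var(ddof=1):.6f}")

# Conventional paired t-test on the same folds, shown only for comparison.
t, p = stats.ttest_rel(acc_nb, acc_dt)
print(f"paired t-test: t = {t:.3f}, p = {p:.3f}")

The point of the mean-and-variance summary is that it needs only each classifier's own accuracy distribution, so the two classifiers do not have to be run side by side on a shared experimental superstructure, which is the practical difficulty with the paired t-test noted above.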
