Abstract

Binary classification is one of the central problems in machine-learning research and, as such, investigations of its general statistical properties are of interest. We studied the ranking statistics of items in binary classification problems and observed that there is a formal and surprising relationship between the probability of a sample belonging to one of the two classes and the Fermi-Dirac distribution determining the probability that a fermion occupies a given single-particle quantum state in a physical system of noninteracting fermions. Using this equivalence, it is possible to compute a calibrated probabilistic output for binary classifiers. We show that the area under the receiver operating characteristics curve (AUC) in a classification problem is related to the temperature of an equivalent physical system. In a similar manner, the optimal decision threshold between the two classes is associated with the chemical potential of an equivalent physical system. Using our framework, we also derive a closed-form expression to calculate the variance for the AUC of a classifier. Finally, we introduce FiDEL (Fermi-Dirac-based ensemble learning), an ensemble learning algorithm that uses the calibrated nature of the classifier's output probability to combine possibly very different classifiers.
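For reference, the Fermi-Dirac distribution invoked above gives the probability that a single-particle state of energy $\varepsilon$ is occupied in a system of noninteracting fermions at temperature $T$ and chemical potential $\mu$:

$$f(\varepsilon) = \frac{1}{e^{(\varepsilon - \mu)/k_B T} + 1}$$

In the classification analogue described in the abstract, the rank of an item plays the role of the energy, the optimal decision threshold corresponds to the chemical potential, and the AUC of the classifier sets the temperature of the equivalent physical system.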

Highlights

  • We explore the application of Fermi-Dirac (FD) statistics to machine learning in the context of binary classification problems

  • We present a conceptual parallel between certain statistical properties of binary classification problems and the FD distribution

  • Binary classification is a fundamental task in machine learning


Summary

$$\mathrm{AUC} = \frac{1}{N_0 N_1} \sum_{i=1}^{N_1} \sum_{k=1}^{N_0} H\!\left(s_{P,i} - s_{N,k}\right)$$

where sP,i and sN,k are the scores assigned by the classifier to the ith positive example and the kth negative example, respectively, and H(s) is the Heaviside function, which takes the value 1 for positive arguments and 0 for negative arguments.

To determine the extent to which class-conditional dependence influences the performance of the FiDEL ensemble, we developed a model (SI Appendix) that simulates the situation in which every pair of classifiers in the ensemble has a conditional rank correlation, given both the positive and the negative class, equal to a parameter r, which we varied between 0 (the uncorrelated case) and 0.6. The WoC (wisdom-of-crowds) ensemble is a classifier whose score for a given item is the average of the ranks assigned to that item by the base classifiers. The results of these simulations are summarized in SI Appendix, Figs.

In a previous section we demonstrated that the threshold for segregating the positive and negative classes that zeroes the log-likelihood ratio is the threshold that maximizes the balanced accuracy; we did so by expressing the balanced accuracy in terms of the class-conditioned rank probabilities of the classifier. Given that yr can take only the values 1 and 0, its mean value yr in the (N0, N1) ensemble is equal
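The pairwise Heaviside counting of the AUC formula above, and the rank-averaging rule of the WoC ensemble, can both be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's code: the function names are ours, and treating tied score pairs as contributing 0.5 is a common convention that the excerpt does not specify.

```python
import numpy as np

def auc_pairwise(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) score pairs ranked
    correctly: (1 / (N0 * N1)) * sum_{i,k} H(s_P,i - s_N,k)."""
    diffs = scores_pos[:, None] - scores_neg[None, :]
    # Heaviside H: 1 for positive differences, 0 for negative, 0.5 at ties
    return float(np.mean(np.heaviside(diffs, 0.5)))

def rank_average(score_matrix):
    """WoC-style ensemble score: for each item (column), average the
    1-based within-classifier ranks across classifiers (rows)."""
    ranks = np.argsort(np.argsort(score_matrix, axis=1), axis=1) + 1
    return ranks.mean(axis=0)

# Toy example: both positives scored above both negatives -> perfect ranking
print(auc_pairwise(np.array([3.0, 2.0]), np.array([1.0, 0.0])))  # 1.0
```

Note that `rank_average` operates on ranks rather than raw scores, which is what makes the WoC combination insensitive to differences in the scales of the base classifiers' outputs.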

