Abstract

I examine the construction and evaluation of machine learning (ML) binary classification models. These models are increasingly used for societal applications such as classifying patients into two categories according to the presence or absence of a certain disease like cancer or heart disease. I argue that the construction of ML (binary) classification models involves an optimisation process aiming at the minimisation of the inductive risk associated with the intended uses of these models. I also argue that the construction of these models is underdetermined by the available data, and that this makes it necessary for ML modellers to make social value judgments in determining the error costs (associated with misclassifications) used in ML optimisation. I thus suggest that the assessment of the inductive risk with respect to the social values of the intended users is an integral part of the construction and evaluation of ML classification models. I also discuss the implications of this conclusion for the philosophical debate concerning inductive risk.

Highlights

  • The societal need to extract useful information from large and complex data sets, often referred to as big data, has led to the emergence of big data analytics

  • Classification accuracy is not an appropriate metric for evaluating the predictive performance of machine learning (ML) classification models with unequal error costs, as it does not account for the unequal costs assigned to false positives (FP) and false negatives (FN) (Provost et al., 1998)

  • I have argued that the construction of ML classification models illustrates inductive underdetermination of model construction, in the sense that the methodological choices underlying the construction of these models are underdetermined by the training data, which constitutes the sole empirical evidence for ML model construction
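The second highlight's point can be made concrete with a small numerical sketch. The function names, labels, and cost values below are illustrative assumptions, not taken from the article; they simply show how a classifier with higher accuracy can nonetheless incur a higher expected misclassification cost when false negatives are costlier than false positives (as in disease screening).

```python
def expected_cost(y_true, y_pred, cost_fn, cost_fp):
    """Average misclassification cost over a test set (label 1 = positive)."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (cost_fn * fn + cost_fp * fp) / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of correctly classified instances."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Two hypothetical classifiers on the same labels: A makes two cheap
# false-positive errors; B makes a single, costly false-negative error.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pred_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # 2 FP, 0 FN
pred_b = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # 0 FP, 1 FN

print(accuracy(y_true, pred_a), accuracy(y_true, pred_b))  # 0.8 0.9
# Suppose a missed diagnosis (FN) is judged 10x as costly as a false alarm (FP):
print(expected_cost(y_true, pred_a, cost_fn=10, cost_fp=1))  # 0.2
print(expected_cost(y_true, pred_b, cost_fn=10, cost_fp=1))  # 1.0
```

By the accuracy metric B looks better (0.9 vs 0.8), yet under the assumed cost ratio B's expected cost is five times A's, which is the sense in which accuracy fails to track performance under unequal error costs.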


Summary

Introduction

The societal need to extract useful information from large and complex data sets, often referred to as big data, has led to the emergence of big data analytics. This is a new field of study encompassing various computational methods developed to cope with the growing complexity of big data analysis. The accuracy of the predictions of ML models depends on how well these models generalize to new data sets beyond those used to construct and test them. In this regard, the application of ML models to big data is based on inductive generalization, and as a result their predictions about new data sets are always prone to error. In the philosophy of science literature, inductive risk has been discussed in the context of theory (or hypothesis or model) acceptance, whereas its relevance to the context of theory (or hypothesis or model) construction has been neglected. I will discuss the implications of this conclusion for the philosophical debate concerning inductive risk.
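One standard way error costs enter cost-sensitive classification, in line with the article's claim that value-laden error costs shape ML optimisation, is through the decision threshold applied to a model's predicted probability: the cost-minimising threshold depends only on the ratio of the false-positive cost to the total cost of the two error types. This is a generic sketch of that relationship; the cost values are illustrative assumptions, not figures from the article.

```python
def decision_threshold(cost_fp, cost_fn):
    """Probability above which predicting 'positive' minimises expected cost.

    Predicting positive is the cheaper bet whenever
    p * 0 + (1 - p) * cost_fp  <=  p * cost_fn + (1 - p) * 0,
    i.e. whenever p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)

def classify(prob_positive, cost_fp, cost_fn):
    """Cost-sensitive decision rule: 1 = positive, 0 = negative."""
    return int(prob_positive >= decision_threshold(cost_fp, cost_fn))

# With symmetric costs the threshold is the familiar 0.5 ...
print(decision_threshold(1, 1))  # 0.5
# ... but if a missed diagnosis (FN) is judged nine times worse than a
# false alarm (FP), the threshold drops and borderline cases are flagged.
print(decision_threshold(1, 9))  # 0.1
print(classify(0.3, 1, 1))  # 0
print(classify(0.3, 1, 9))  # 1
```

The same patient (predicted probability 0.3) is classified differently under the two cost assignments, which illustrates why choosing the error costs is not a purely technical step: it encodes a judgment about the relative social harm of the two kinds of mistake.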

The inductive risk argument and Jeffrey’s counterargument
Inductive risk in the context of model construction
Essential elements and aspects of ML
Supervised ML: an illustrative example
Underdetermination of ML model construction and inductive risk
Cost‐sensitive ML optimisation
Evaluation of ML binary classification models
Algorithmic and epistemic opacity in deep ML
Conclusions