Abstract

In this paper we investigate risk prediction of criminal re-offense among juvenile defendants using general-purpose machine learning (ML) algorithms. We show that in our dataset, containing hundreds of cases, ML models achieve better predictive power than a structured professional risk assessment tool, the Structured Assessment of Violence Risk in Youth (SAVRY), at the expense of not satisfying relevant group fairness metrics that SAVRY does satisfy. We explore in more detail two possible causes of this algorithmic bias that are related to biases in the data with respect to two protected groups, foreigners and women. In particular, we look at (1) the differences in the prevalence of re-offense between protected groups and (2) the influence of protected group or correlated features in the prediction. Our experiments show that both can lead to disparity between groups on the considered group fairness metrics. We observe that methods to mitigate the influence of either cause do not guarantee fair outcomes. An analysis of feature importance using LIME, a machine learning interpretability method, shows that some mitigation methods can shift the set of features that ML techniques rely on away from demographics and criminal history, which are highly correlated with sensitive features.
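
The group fairness metrics referred to above can be computed directly from a model's predictions. The sketch below is a minimal Python illustration, not the evaluation code used in the paper: it contrasts a protected group with the remaining defendants on statistical parity and on false positive / false negative rate balance. The variable names and the toy arrays are illustrative only.

    import numpy as np

    def rates(y_true, y_pred):
        # Per-group quantities: share predicted high-risk, false positive rate,
        # and false negative rate (arrays are 0/1).
        positive_rate = y_pred.mean()
        fpr = y_pred[y_true == 0].mean()
        fnr = (1 - y_pred[y_true == 1]).mean()
        return positive_rate, fpr, fnr

    def group_disparities(y_true, y_pred, group):
        # `group` marks membership in the protected group (e.g., foreigner, woman).
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        group = np.asarray(group).astype(bool)
        pr_a, fpr_a, fnr_a = rates(y_true[group], y_pred[group])
        pr_b, fpr_b, fnr_b = rates(y_true[~group], y_pred[~group])
        return {
            "statistical_parity_diff": pr_a - pr_b,  # demographic parity gap
            "fpr_diff": fpr_a - fpr_b,               # error-rate balance, false positives
            "fnr_diff": fnr_a - fnr_b,               # error-rate balance, false negatives
        }

    # Toy evaluation set: 1 = observed/predicted re-offense, group 1 = protected group.
    y_true = [0, 1, 0, 1, 0, 0, 1, 1]
    y_pred = [0, 1, 1, 1, 0, 0, 0, 1]
    group  = [1, 1, 1, 0, 0, 0, 0, 1]
    print(group_disparities(y_true, y_pred, group))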

Highlights

  • In recent years, there has been increasing use of Machine Learning (ML) to assist decision making in areas of high societal relevance such as criminal justice (Berk et al 2017; Goel et al 2018)

  • We look at two particular issues that occur within the training data and that are potentially problematic for specific group fairness metrics: unequal base rates and the use of input features strongly correlated with the protected features (both are illustrated in the sketch after this list)

  • Note that the performance of off-the-shelf ML methods on this dataset is similar to recidivism prediction on other datasets: 0.67 for a 5-variable random forest classifier (Green and Chen 2019), 0.68–0.71 for COMPAS (Northpoint, Inc. 2012), 0.65–0.66 for the Public Safety Assessment (DeMichele et al 2018), and 0.57–0.74 in a meta-study of various risk assessment instruments used in the US (Desmarais et al 2016)

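The two data issues above can be checked with a few lines of code. The sketch below is a toy Python example rather than anything from the paper's pipeline: it computes the per-group prevalence of re-offense (base rates) and the correlation of an input feature with the protected attribute. The column names (recidivist, foreigner, prior_offenses) and the values are hypothetical.

    import pandas as pd

    # Hypothetical training data; columns are illustrative, not the SAVRY dataset's variables.
    df = pd.DataFrame({
        "recidivist":     [1, 0, 1, 0, 0, 1, 0, 0],
        "foreigner":      [1, 1, 1, 0, 0, 1, 0, 0],
        "prior_offenses": [3, 1, 4, 0, 1, 2, 0, 1],
    })

    # (1) Unequal base rates: prevalence of re-offense per protected group.
    print(df.groupby("foreigner")["recidivist"].mean())

    # (2) Proxy features: correlation of an input feature with the protected attribute.
    print(df["prior_offenses"].corr(df["foreigner"]))
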

Introduction

In recent years, there has been increasing use of Machine Learning (ML) to assist decision making in areas of high societal relevance such as criminal justice (Berk et al 2017; Goel et al 2018). ML models are able to learn rules from large datasets and may improve decision processes by being more accurate and by avoiding human cognitive biases (Langley and Simon 1995; Kleinberg et al 2017). The European Convention on Human Rights (Article 14) forbids discrimination by “sex, race, colour, language, religion, political or other opinion, national or social origin, association with a national minority, property, birth or other status.” Complying with such anti-discrimination law, however, does not by itself ensure that an algorithm is “fair”: the literature mentions at least 21 definitions of fairness (see, e.g., Berk et al 2017; Narayanan 2018; Tolan 2018 for an overview of different definitions of algorithmic fairness), and some group fairness criteria have been proven to be incompatible with each other (Chouldechova 2017; Kleinberg et al 2016).
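
As a brief illustration of why some group fairness criteria cannot hold at the same time, the sketch below uses the relation derived in Chouldechova 2017, FPR = p/(1−p) · (1−PPV)/PPV · (1−FNR): if two groups have different prevalence p but the classifier has the same positive predictive value (calibration) and the same false negative rate for both, their false positive rates must differ. The prevalence and rate values are hypothetical, not taken from the paper.

    # Numerical illustration (following Chouldechova 2017) of the incompatibility
    # between calibration, equal FNR, and equal FPR under unequal base rates.
    def fpr(prevalence, ppv, fnr):
        # FPR implied by prevalence p, positive predictive value, and FNR:
        # FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR)
        return prevalence / (1 - prevalence) * (1 - ppv) / ppv * (1 - fnr)

    ppv, fnr = 0.7, 0.3              # identical across groups: calibration + equal FNR
    p_group_a, p_group_b = 0.4, 0.2  # unequal (hypothetical) base rates of re-offense

    print(fpr(p_group_a, ppv, fnr))  # ~0.20
    print(fpr(p_group_b, ppv, fnr))  # ~0.08 -> false positive rates necessarily differ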
