Abstract

In this paper the goal is to explain predictions from complex machine learning models. One method that has become very popular during the last few years is Shapley values. The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. If the features in reality are dependent, this may lead to incorrect explanations. Hence, there have recently been attempts at appropriately modelling/estimating the dependence between the features. Although the previously proposed methods clearly outperform the traditional approach assuming independence, they have their weaknesses. In this paper we propose two new approaches for modelling the dependence between the features. Both approaches are based on vine copulas, which are flexible tools for modelling multivariate non-Gaussian distributions, able to characterise a wide range of complex dependencies. The performance of the proposed methods is evaluated on simulated data sets and a real data set. The experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
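
To make the setting concrete, recall the Shapley decomposition used for prediction explanation: feature j receives the value phi_j = sum over S subset of M\{j} of |S|!(|M|-|S|-1)!/|M|! * (v(S u {j}) - v(S)), where v(S) = E[f(x) | x_S] is the expected prediction given the features in S. The sketch below is a minimal, hypothetical Python implementation (not the paper's code): it enumerates this sum for one prediction and approximates v(S) by drawing the remaining features from background data, which is exactly the independence assumption the paper criticises; the proposed vine copula approaches would replace these draws with samples from the conditional distribution.

```python
import itertools
import math

import numpy as np


def shapley_values(f, x, X_background, n_mc=1000, rng=None):
    """Exact Shapley values for a single prediction f(x).

    The contribution function v(S) = E[f(X) | X_S = x_S] is approximated
    by Monte Carlo: features outside S are drawn from background data,
    which implicitly assumes independent features -- the assumption the
    paper replaces with conditional (vine copula) sampling.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = len(x)

    def v(S):
        rows = rng.integers(0, len(X_background), size=n_mc)
        Z = X_background[rows].copy()   # candidate values for features not in S
        Z[:, list(S)] = x[list(S)]      # pin the features in S to their observed values
        return f(Z).mean()

    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            w = math.factorial(size) * math.factorial(d - size - 1) / math.factorial(d)
            for S in itertools.combinations(others, size):
                phi[j] += w * (v(S + (j,)) - v(S))
    return phi


# Usage with some fitted model exposing a predict method (hypothetical names):
# phi = shapley_values(model.predict, x_star, X_train)
```

By construction the values sum (up to Monte Carlo error) to f(x) minus the average background prediction, which is what makes them attractive as local attributions.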

Highlights

  • In many applications, complex machine learning models like Gradient Boosting Machines, Random Forests and Deep Neural Networks are outperforming traditional regression models

  • The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent

  • In this paper we propose two new approaches for modelling the dependence between the features. Both approaches are based on vine copulas, which are flexible tools for modelling multivariate non-Gaussian distributions able to characterise a wide range of complex dependencies (a fitting sketch follows this list)
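
As a rough illustration of the highlighted tool class, the following sketch fits a vine copula to dependent, non-Gaussian features and simulates from it. It assumes the third-party pyvinecopulib package (whose API varies across versions; newer releases expose Vinecop.from_data instead of the data= constructor), and the data-generating process is invented for illustration; none of this comes from the paper itself.

```python
import numpy as np
import pyvinecopulib as pv

rng = np.random.default_rng(0)

# Invented dependent, non-Gaussian features: x2 is a noisy function of x1.
n = 2000
x1 = rng.gamma(shape=2.0, scale=1.0, size=n)
x2 = np.log1p(x1) + 0.3 * rng.standard_normal(n)
x3 = rng.exponential(size=n)
X = np.column_stack([x1, x2, x3])

# Vine copulas separate marginals from dependence: map each feature to
# pseudo-observations on (0, 1), then fit pair copulas tree by tree.
U = pv.to_pseudo_obs(X)
cop = pv.Vinecop(data=U)   # automatic structure and family selection
print(cop)                 # chosen pair-copula families, tree by tree

# Draw new dependent samples on the copula scale; a full pipeline would
# map them back through the inverse marginal CDFs.
U_sim = cop.simulate(1000)
```

The appeal for Shapley value estimation is that the fitted pair copulas describe the full dependence structure, from which the conditional distributions needed for v(S) can be derived.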


Summary

Introduction

In many applications, complex machine learning models like Gradient Boosting Machines, Random Forests and Deep Neural Networks are outperforming traditional regression models. Existing work on explaining complex models may be divided into two main categories: global and local explanations. The former try to describe the model as a whole, in terms of which variables/features influenced the general model the most. The latter, on the other hand, try to identify how the different input variables/features influenced a specific prediction/output from the model, and are often referred to as individual prediction explanation methods. Such explanations are useful for complex models which behave rather differently for different feature combinations, meaning that the global explanation is not representative of the local behavior.
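
The gap between global and local explanations can be made concrete with a toy model (our own illustration, not an example from the paper): a feature whose effect on individual predictions flips sign depending on another feature can look negligible in a global average, even though it matters for every single prediction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: the effect of x1 on the prediction flips with the sign of x2.
def f(X):
    return np.where(X[:, 1] > 0, X[:, 0], -X[:, 0])

X = rng.standard_normal((10_000, 2))
eps = np.array([[0.1, 0.0]])   # nudge x1 by 0.1, leave x2 alone

effect = f(X + eps) - f(X)     # per-instance (local) effect of x1
print(effect.mean())           # ~0: the signed global average hides x1 entirely
print(np.abs(effect).mean())   # 0.1: yet x1 moves every individual prediction
```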


