Estimation Of Conditional Expectation Research Articles

Shapley values originated in cooperative game theory but are now widely used in industry and academia as a model-agnostic framework for explaining predictions made by complex machine learning models. There are several algorithmic approaches for computing different versions of Shapley value explanations. Here, we consider Shapley values that incorporate feature dependencies, referred to as conditional Shapley values, for predictive models fitted to tabular data. Estimating conditional Shapley values precisely is difficult because it requires estimating non-trivial conditional expectations. In this article, we develop new methods, extend earlier proposed approaches, and systematize the new, refined, and existing methods into method classes for comparison and evaluation. The method classes use either Monte Carlo integration or regression to model the conditional expectations. We conduct extensive simulation studies to evaluate how precisely the different method classes estimate the conditional expectations, and thereby the conditional Shapley values, for different setups. We also apply the methods to several real-world data experiments and provide recommendations for when to use the different method classes and approaches. Roughly speaking, we recommend parametric methods when the data distribution can be specified almost correctly, as they generally produce the most accurate Shapley value explanations. When the distribution is unknown, both generative methods and regression models with a form similar to the underlying predictive model are good and stable options. Regression-based methods are often slow to train but produce the Shapley value explanations quickly once trained. The reverse holds for Monte Carlo-based methods, which makes the different methods appropriate in different practical situations.
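The contrast between the two method classes can be illustrated with a minimal sketch, which is not taken from the article. It assumes a hypothetical fitted model `model(x1, x2) = x1 + x2**2`, two standard Gaussian features with an assumed correlation `RHO`, and a quadratic regression basis chosen because it happens to match the true conditional mean in this toy setup. The Monte Carlo estimator samples the unobserved feature from its known conditional distribution (the parametric case), while the regression estimator fits the model output against the observed feature once and then predicts.

```python
import random

RHO = 0.7  # assumed correlation between the two features (illustrative)


def model(x1, x2):
    """Hypothetical fitted predictive model; not from the article."""
    return x1 + x2 ** 2


def exact_conditional_expectation(x1):
    # For a standard bivariate Gaussian, X2 | X1 = x1 ~ N(RHO * x1, 1 - RHO^2),
    # so E[x1 + X2^2 | X1 = x1] = x1 + (RHO * x1)^2 + (1 - RHO^2).
    return x1 + (RHO * x1) ** 2 + (1 - RHO ** 2)


def monte_carlo_estimate(x1, n_samples, rng):
    """Average the model over draws of X2 from its conditional distribution."""
    sd = (1 - RHO ** 2) ** 0.5
    total = 0.0
    for _ in range(n_samples):
        total += model(x1, rng.gauss(RHO * x1, sd))
    return total / n_samples


def solve_linear_system(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_regression(n_train, rng):
    """Least-squares fit of model(X1, X2) on the basis (1, X1, X1^2)."""
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    sd = (1 - RHO ** 2) ** 0.5
    for _ in range(n_train):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(RHO * x1, sd)
        y = model(x1, x2)
        basis = [1.0, x1, x1 ** 2]
        for i in range(3):
            xty[i] += basis[i] * y
            for j in range(3):
                xtx[i][j] += basis[i] * basis[j]
    return solve_linear_system(xtx, xty)


def regression_estimate(coef, x1):
    return coef[0] + coef[1] * x1 + coef[2] * x1 ** 2


rng = random.Random(0)
x1_star = 1.5  # observed feature value being conditioned on
exact = exact_conditional_expectation(x1_star)
mc = monte_carlo_estimate(x1_star, 20000, rng)
coef = fit_regression(20000, rng)  # paid once, reused for every explanation
reg = regression_estimate(coef, x1_star)
print(f"exact={exact:.4f}  monte_carlo={mc:.4f}  regression={reg:.4f}")
```

In this sketch the regression cost is paid once in `fit_regression`, after which each new conditioning value costs a single polynomial evaluation, whereas the Monte Carlo estimator needs no training but resamples for every conditioning value. That mirrors the speed trade-off described above, under the stated toy assumptions only.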

Expected value of information methods evaluate the potential health benefits that can be obtained from conducting new research to reduce uncertainty in the parameters of a cost-effectiveness analysis model, hence reducing decision uncertainty. Expected value of partial perfect information (EVPPI) provides an upper limit to the health gains that can be obtained from conducting a new study on a subset of parameters in the cost-effectiveness analysis. It can therefore be used as a sensitivity analysis to identify the parameters that contribute most to decision uncertainty and to help guide decisions about which types of study are most valuable to prioritize for funding. A common general approach is to use nested Monte Carlo simulation to obtain an estimate of EVPPI. This approach is computationally intensive, can lead to significant sampling bias if too few inner samples are drawn, and can give incorrect results if correlations between parameters are not handled appropriately. In this article, we set out a range of methods for estimating EVPPI that avoid the need for nested simulation: reparameterization of the net benefit function, Taylor series approximations, and restricted cubic spline estimation of conditional expectations. For each method, we set out the generalized functional form that net benefit must take for the method to be valid. By specifying this functional form, our methods are able to focus on the components of the model in which approximation is required, avoiding the complexities involved in developing statistical approximations for the model as a whole. Our methods also allow for any correlations that might exist between model parameters. We illustrate the methods using an example of fluid resuscitation in African children with severe malaria.
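The inner-sample bias of nested simulation, and the idea of replacing the inner loop with an estimated conditional expectation, can be sketched on a toy decision problem that is not the article's example. The sketch assumes two options, standard care with net benefit 0 and a new option with hypothetical net benefit `theta1 + theta2`, with independent Gaussian priors whose parameters (`MU1`, `SD1`, `SD2`) are illustrative. Because the conditional expectation is linear in `theta1` here, a straight-line fit stands in for the restricted cubic spline mentioned above.

```python
import random
from statistics import NormalDist

MU1, SD1 = 0.5, 1.0  # assumed prior for the parameter of interest, theta1
SD2 = 2.0            # assumed prior sd of the nuisance parameter theta2 (mean 0)


def net_benefit_new(theta1, theta2):
    """Hypothetical net benefit of the new option; standard care is fixed at 0."""
    return theta1 + theta2


def mean(values):
    return sum(values) / len(values)


def exact_evppi():
    # E[NB | theta1] = theta1, so EVPPI = E[max(0, theta1)] - max(0, MU1).
    z = MU1 / SD1
    std = NormalDist()
    return MU1 * std.cdf(z) + SD1 * std.pdf(z) - max(0.0, MU1)


def nested_mc_evppi(n_outer, n_inner, rng):
    """Outer loop over theta1, inner loop over theta2 given theta1."""
    inner_means = []
    for _ in range(n_outer):
        theta1 = rng.gauss(MU1, SD1)
        draws = [net_benefit_new(theta1, rng.gauss(0.0, SD2)) for _ in range(n_inner)]
        inner_means.append(mean(draws))
    with_information = mean([max(0.0, m) for m in inner_means])
    without_information = max(0.0, mean(inner_means))
    return with_information - without_information


def regression_evppi(n_samples, rng):
    """Single-loop estimate: fit E[NB | theta1] by a straight line, then average."""
    theta1 = [rng.gauss(MU1, SD1) for _ in range(n_samples)]
    nb = [net_benefit_new(t, rng.gauss(0.0, SD2)) for t in theta1]
    t_bar, nb_bar = mean(theta1), mean(nb)
    sxy = sum((t - t_bar) * (y - nb_bar) for t, y in zip(theta1, nb))
    sxx = sum((t - t_bar) ** 2 for t in theta1)
    slope = sxy / sxx
    intercept = nb_bar - slope * t_bar
    fitted = [intercept + slope * t for t in theta1]
    return mean([max(0.0, f) for f in fitted]) - max(0.0, nb_bar)


rng = random.Random(1)
exact = exact_evppi()
few_inner = nested_mc_evppi(2000, 2, rng)     # too few inner samples
many_inner = nested_mc_evppi(2000, 200, rng)  # 100x the inner-loop cost
single_loop = regression_evppi(20000, rng)
print(f"exact={exact:.3f}  nested(2)={few_inner:.3f}  "
      f"nested(200)={many_inner:.3f}  regression={single_loop:.3f}")
```

With only two inner samples, the noisy inner means inflate the expected maximum, so the nested estimate overshoots the exact value. Increasing the inner sample size reduces the bias at a proportional computational cost, while the single-loop regression estimate avoids the inner loop entirely in this toy setting.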

Articles published on Estimation Of Conditional Expectation

Stone's theorem for distributional regression in Wasserstein distance

A comparative study of methods for estimating model-agnostic Shapley value explanations

A nonstationary bivariate design flood estimation approach coupled with the most likely and expectation combination strategies

Probabilistic threshold analysis by pairwise stochastic approximation for decision-making under uncertainty

Robust Estimator of Conditional Tail Expectation of Pareto-Type Distribution

Consistent regression using data-dependent coverings

Parametric g-formula implementations for causal survival analyses

Nonparametric Estimation of Conditional Expectation with Auxiliary Information and Dimension Reduction

Conditional expectation estimation through attributable components

B-spline techniques for volatility modeling

Convergence Analysis of Random Generators in Monte Carlo Simulation: Mersenne Twister and Sobol

Nonparametric tests of conditional treatment effects with an application to single-sex schooling on academic achievements

Estimation of a Semiparametric Natural Direct Effect Model Incorporating Baseline Covariates

Strategies for Efficient Computation of the Expected Value of Partial Perfect Information

Fast Convergence of Regress-Later Estimates in Least Squares Monte Carlo

On data-based optimal stopping under stationarity and ergodicity

The consequences of measurement error when estimating the impact of obesity on income

Nonlinear Filtering of Stochastic Navier-Stokes Equation with Itô-Lévy Noise

Reducing variance in the numerical solution of BSDEs

Distributed Functional Scalar Quantization Simplified
