Abstract
As machine learning is applied across an ever wider range of domains, interpreting the internal mechanisms and predictions of complex models is crucial for their further adoption. However, interpreting complex machine learning models on biased data remains a difficult problem. When the important explanatory features of the data of interest are strongly affected by contaminated distributions, particularly in risk-sensitive fields such as self-driving vehicles and healthcare, it is essential to provide users with a robust interpretation of complex models. Interpreting a complex model is often approached by analyzing its features through measures of feature importance. This article therefore proposes a novel method, derived from high-dimensional model representation (HDMR), for measuring feature importance. The proposed method yields robust estimates when the input features follow contaminated distributions. Moreover, the method is model-agnostic, and this generalizability makes it easier to compare interpretations across different models. Experimental evaluations on artificial models and machine learning models show that the proposed method is more robust than the traditional HDMR-based method.
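As a concrete illustration (not taken from the article's experiments) of what a contaminated distribution means here, consider a Huber-style epsilon-contamination model that mixes a nominal distribution with a small fraction of a heavy-tailed one: plain variance-based summaries inflate under such contamination, while robust scale estimates stay close to the clean value. The sketch below is a minimal Python example; the function name, the mixture parameters, and the choice of the MAD as the robust estimate are illustrative assumptions.

```python
import numpy as np

def contaminated_samples(n, eps=0.05, rng=None):
    """Huber eps-contamination: (1 - eps) * N(0, 1) + eps * N(0, 10^2).
    Most points follow the nominal distribution; a fraction eps comes
    from a heavy-tailed outlier distribution (illustrative choice)."""
    rng = np.random.default_rng(rng)
    is_outlier = rng.random(n) < eps
    clean = rng.normal(0.0, 1.0, size=n)
    outliers = rng.normal(0.0, 10.0, size=n)
    return np.where(is_outlier, outliers, clean)

x = contaminated_samples(100_000, eps=0.05, rng=0)
# The sample variance is pulled far from the clean value of 1.0,
# while a robust scale estimate (MAD, scaled for consistency) is not.
print("sample variance  :", x.var())
print("MAD-based scale^2:", (1.4826 * np.median(np.abs(x - np.median(x)))) ** 2)
```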
Highlights
Machine learning has been widely used in various fields
When variance-based indices are estimated, there is a high probability of producing an uncertain feature importance ranking, because the estimates for the inputs are unstable from sample to sample [32]; we address this problem in this study
The index R_i^SD is used to represent the Sobol method, a variance-based method obtained via analysis of variance (ANOVA)-high-dimensional model representation (HDMR), instead of the first-order effect index S_i in (3); a standard estimator of S_i is sketched after these highlights
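For reference, the first-order effect index mentioned above is defined in the ANOVA-HDMR decomposition as S_i = Var(E[Y | X_i]) / Var(Y). The sketch below is a minimal, generic pick-freeze Monte Carlo estimator of S_i (a Saltelli-style estimator), not the article's proposed robust index R_i^SD; the function name, the uniform independent input sampler, and the toy test model are illustrative assumptions.

```python
import numpy as np

def first_order_sobol(f, d, n=100_000, rng=None):
    """Pick-freeze Monte Carlo estimate of the first-order Sobol indices
    S_i = Var(E[Y | X_i]) / Var(Y) for a model y = f(X) with d inputs,
    here assumed independent and uniform on [0, 1]."""
    rng = np.random.default_rng(rng)
    A = rng.uniform(size=(n, d))          # two independent input matrices
    B = rng.uniform(size=(n, d))
    fA, fB = f(A), f(B)
    var_y = np.var(np.concatenate([fA, fB]), ddof=1)
    S = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]               # replace only column i by B's
        # Saltelli-style estimator of V_i = Var(E[Y | X_i])
        S[i] = np.mean(fB * (f(ABi) - fA)) / var_y
    return S

# Toy additive model with known answer: S is roughly (0.80, 0.20, 0.002)
def model(X):
    return 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * X[:, 2]

print(first_order_sobol(model, d=3, rng=0))
```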
Summary
Machine learning has been widely used in various fields. In practice, machine learning models are often evaluated by their accuracy, and the pursuit of predictive accuracy leads to the use of increasingly complex models. Simple, interpretable models often do not achieve the best predictive accuracy [2]. Complex machine learning models, such as deep neural networks, are commonly referred to as ‘‘black boxes’’ because their internal working mechanisms and decision-making processes are difficult for humans to understand. Such a lack of transparency can raise severe issues and hinder further applications of machine learning. The reason why certain simple models, such as logistic regression and decision tree models, are widely used is partly attributable to their interpretability.