Abstract

Cost‐sensitive ensemble learning combines two approaches, ensemble learning and cost‐sensitive learning, and enables the generation of cost‐sensitive tree‐based ensemble models using the cost‐sensitive decision tree (CSDT) learning algorithm. In general, tree‐based models have a clear graphical representation that can explain a model's decision‐making process. However, the depth of the trees and the number of base models in the ensemble can limit how well the model's decision for each sample can be understood. CSDT models are widely used in finance (e.g., credit scoring and fraud detection) but lack effective explanation methods. We previously addressed this gap with the cost‐sensitive tree Shapley Additive Explanation method (CSTreeSHAP), an explanation method for the single‐tree CSDT model. Here, we extend that methodology to cost‐sensitive ensemble models, particularly cost‐sensitive random forest models. The paper details the theoretical foundation and implementation of CSTreeSHAP for both single CSDT and ensemble models. The usefulness of the proposed method is demonstrated by providing explanations for single and ensemble CSDT models trained on well‐known benchmark credit scoring datasets. Finally, we apply our methodology to analyze the stability of explanations for those models compared to cost‐insensitive tree‐based models. Our analysis reveals statistically significant differences between SHAP values despite seemingly similar global feature importance plots of the models. This highlights the value of our methodology as a comprehensive tool for explaining CSDT models.
