In geophysics, hydrocarbon prospect risking involves assessing the risks associated with hydrocarbon exploration by integrating data from various sources. Machine learning classifiers trained on tabular data have recently been used to make faster decisions on these prospects, but the lack of transparency in their decision-making has led to the emergence of explainable AI. Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) are two such explainability methods, which aim to generate insights about a particular decision by ranking the input features by importance. However, explanations of the same scenario produced by these two approaches have been shown to disagree or diverge, particularly for complex data, because the concepts of “importance” and “relevance” are defined differently in each. Grounding these ranked features in theoretically backed causal notions of necessity and sufficiency therefore offers a more reliable and robust way to enhance the trustworthiness of these methodologies. We propose a unified framework that generates counterfactuals, quantifies necessity and sufficiency, and uses these measures to evaluate the robustness of the insights provided by LIME and SHAP on high-dimensional structured prospect-risking data. This robustness test yields deeper insight into each model's ability to handle erroneous data and reveals which explainability module pairs most effectively with which model for our hydrocarbon-indicator dataset.
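As a concrete illustration of the kind of LIME/SHAP disagreement described above, the minimal sketch below trains a classifier on a synthetic tabular dataset and compares the two feature rankings for a single prediction. The data, the `attr_i` feature names, and the class labels are hypothetical stand-ins for real hydrocarbon-indicator attributes, not the paper's dataset or framework, and the SHAP output shape handled here depends on the installed `shap` version.

```python
# Illustrative comparison of LIME and SHAP feature rankings for one prediction.
# Synthetic data and generic feature names stand in for real prospect-risking attributes.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer
import shap

X, y = make_classification(n_samples=500, n_features=6, n_informative=4, random_state=0)
feature_names = [f"attr_{i}" for i in range(X.shape[1])]  # hypothetical attribute names
model = RandomForestClassifier(random_state=0).fit(X, y)
instance = X[0]

# LIME: fit a local surrogate model around the instance and rank features by |weight|.
lime_exp = LimeTabularExplainer(
    X, feature_names=feature_names, class_names=["non_prospect", "prospect"],
    mode="classification",
).explain_instance(instance, model.predict_proba, num_features=X.shape[1])
lime_pairs = lime_exp.as_map()[1]  # (feature_index, weight) pairs for the positive class
lime_rank = [feature_names[i] for i, _ in sorted(lime_pairs, key=lambda p: -abs(p[1]))]

# SHAP: Shapley-value attributions for the same instance and class, ranked by |value|.
raw = shap.TreeExplainer(model).shap_values(instance.reshape(1, -1))
if isinstance(raw, list):        # older shap versions: one array per class
    sv = raw[1][0]
else:                            # newer versions: array of shape (n_samples, n_features, n_classes)
    sv = raw[0, :, 1]
shap_rank = [feature_names[i] for i in np.argsort(-np.abs(sv))]

print("LIME ranking:", lime_rank)
print("SHAP ranking:", shap_rank)
```

When the two printed rankings diverge, there is no model-agnostic ground truth for which ordering to trust; this is the gap that causal measures of necessity and sufficiency, estimated from counterfactuals, are intended to address.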