This research explores how well XGBoost (XGB) and Random Forest (RF) models predict customer responses to upsell offers, with an emphasis on Explainable Artificial Intelligence (XAI) techniques for interpreting those predictions. Both models were first trained without hyperparameter tuning and then optimized using 5-fold cross-validation. While RF consistently achieved high accuracy (0.99), XGB exhibited lower accuracy (0.85) yet demonstrated superior precision and recall. Post-tuning, XGB maintained its competitive edge despite a slight decrease in ROC-AUC scores (0.76 and 0.75 versus RF's 0.67 and 0.72), indicating proficiency in classifying positive cases. XAI techniques complemented XGB's predictions, revealing the most significant predictors: inactive duration in days, race (Chinese), total communication count, age, and active period in days. Race (Indian), gender (female), and region (northern) carried less predictive value. The feature importance plot provided a broad overview but did not show how specific attribute values relate to predictions. To address this, a summary violin plot was employed to illustrate how each feature's contribution varies with its actual value, enhancing the understanding of each feature's impact. Results indicated that longer inactivity periods negatively influenced predictions, while non-Chinese ethnicity, higher communication frequency, and younger age were associated with positive outcomes. Dependence plots further elucidated these relationships, highlighting how older non-Chinese customers and those with shorter inactive periods and frequent communication were more likely to accept offers. Local explanations using SHAP force plots and LIME offered deeper insights into specific instances. Overall, the study underscores the value of combining XAI techniques to understand a model's predictions.
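The local explanations mentioned above rest on Shapley values, which split one prediction's deviation from a baseline into per-feature contributions. The following is a minimal sketch of that idea, not the study's implementation: the scoring function `toy_upsell_score`, its coefficients, and the `baseline` and `customer` records are all hypothetical, chosen only so that the signs echo the directions reported in the abstract. It computes exact Shapley values by brute force, which is feasible only for a handful of features; a library such as SHAP would be used in practice.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every ordering in which features switch from baseline to instance."""
    features = list(instance)
    totals = {name: 0.0 for name in features}
    for order in permutations(features):
        current = dict(baseline)          # start from the reference record
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal one feature at a time
            value = predict(current)
            totals[name] += value - previous
            previous = value
    n_orderings = factorial(len(features))
    return {name: total / n_orderings for name, total in totals.items()}


def toy_upsell_score(x):
    """Hypothetical scoring function (assumption, not the paper's model)."""
    score = 0.5
    score -= 0.002 * x["inactive_days"]        # longer inactivity lowers the score
    score += 0.03 * x["communication_count"]   # more communication raises it
    score -= 0.001 * x["age"]                  # younger age raises it slightly
    # Hypothetical interaction: recently active and frequently contacted
    if x["communication_count"] > 5 and x["inactive_days"] < 30:
        score += 0.1
    return score


# Hypothetical reference record and the customer being explained
baseline = {"inactive_days": 90, "communication_count": 2, "age": 40}
customer = {"inactive_days": 10, "communication_count": 8, "age": 30}

phi = shapley_values(toy_upsell_score, customer, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>20}: {contribution:+.3f}")
```

The contributions sum to the difference between the customer's score and the baseline score, which is the additive property that a force plot visualizes for a single instance. The interaction bonus is shared equally between the two features that jointly trigger it.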