Abstract

Summary: This research developed and tested machine learning models to predict significant credit card fraud in corporate systems using Sarbanes-Oxley (SOX) reports, news reports of breaches and Fama-French risk factors (FF). Exploratory analysis found that SOX information predicted several types of security breaches, with the strongest performance in predicting credit card fraud. Hyperparameters were systematically tuned for a suite of machine learning models: a random forest, an extremely-randomized forest, a random grid of gradient boosting machines (GBMs), a random grid of deep neural nets, and a fixed grid of general linear models. These were assembled into two trained stacked ensemble models optimized for F1 performance: an ensemble that contained all the models, and an ensemble containing just the best-performing model from each algorithm class. Tuned GBMs performed best under all conditions. Without FF, the models yielded an AUC of 99.3%, and the closeness of the training and validation matrices confirms that the model is robust. The most important predictors were firm-specific, as would be expected, since control weaknesses vary at the firm level. Audit firm fees were the most important non-firm-specific predictors. Adding FF to the model yielded perfect prediction (100%) in the training confusion matrix and an AUC of 99.8%. In this model, the most important predictor of credit card fraud was the FF coefficient for the High Minus Low (HML) book-to-market factor. The second most influential variable was the year of reporting, and the third most important was the R² of the Fama-French 3-factor model; together these described most of the variance in credit card fraud occurrence. In all cases, the four major SOX-specific opinions rendered by auditors and the signed SOX report had little predictive influence.
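To make "optimized for F1 performance" concrete, the following is a minimal sketch of choosing the classification threshold that maximizes F1 on a set of predicted fraud probabilities. The abstract does not say how the optimization was carried out, so this is an illustration under assumptions, not the authors' procedure: the function names (`confusion_counts`, `f1_from_counts`, `best_f1_threshold`) and the toy labels and scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) when predicting fraud for scores >= threshold."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(
    y_true: Sequence[int], scores: Sequence[float]
) -> Tuple[float, float]:
    """Scan each distinct score as a candidate threshold; keep the best F1.

    Ties are resolved in favor of the lowest threshold.
    """
    best_t, best_f1 = 0.0, -1.0
    for t in sorted(set(scores)):
        tp, fp, fn, _ = confusion_counts(y_true, scores, t)
        f1 = f1_from_counts(tp, fp, fn)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical toy data: 1 = fraud breach reported, 0 = none.
toy_labels: List[int] = [0, 0, 1, 1, 0, 1]
toy_scores: List[float] = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
threshold, f1 = best_f1_threshold(toy_labels, toy_scores)
print(f"threshold={threshold}, F1={f1:.3f}")
```

F1 balances precision and recall, which matters when the positive class (a firm with reported fraud) is rare and raw accuracy would reward always predicting "no fraud".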
