Gun-related crime continues to be an urgent public health and safety problem in cities across the US. A key question is: how are firearms diverted from the legal retail market into the hands of gun offenders? With close to 8 million legal firearm transaction records in California (2010–2020) linked to over 380,000 records of recovered crime guns (2010–2021), we employ supervised machine learning to predict which firearms are used in crimes shortly after purchase. Specifically, using random forest (RF) with stratified under-sampling, we predict any crime gun recovery within a year (0.2% of transactions) and violent crime gun recovery within a year (0.03% of transactions). We also identify the purchaser, firearm, and dealer characteristics most predictive of this short time-to-crime gun recovery using SHapley Additive exPlanations and mean decrease in accuracy variable importance measures. Overall, our models show good discrimination, and we are able to identify firearms at extreme risk for diversion into criminal hands. The test set AUC is 0.85 for both models. For the model predicting any recovery, a default threshold of 0.50 results in a sensitivity of 0.63 and a specificity of 0.88. Among transactions identified as extremely risky, e.g., transactions with a score of 0.98 and above, 74% (35/47 in the test data) are recovered within a year. The most important predictive features include purchaser age and caliber size. This study suggests the potential utility of transaction records combined with machine learning to identify firearms at the highest risk for diversion and criminal use soon after purchase.
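The evaluation quantities quoted above (AUC, and sensitivity and specificity at a score threshold) and the idea of under-sampling a rare outcome can be illustrated with a minimal, standard-library sketch. This is not the authors' pipeline: the function names, the `ratio` parameter, and the toy scores and labels are hypothetical. The under-sampling shown is one simple variant (keep every positive, draw a random subset of negatives); the exact scheme used with the random forest is not specified in this abstract.

```python
import random


def undersample_majority(labels, ratio=1.0, seed=0):
    """Return sorted indices that keep all positives and a random subset of negatives.

    `ratio` (hypothetical parameter) is the number of negatives kept per positive.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg), int(round(ratio * len(pos))))
    return sorted(pos + rng.sample(neg, n_keep))


def sensitivity_specificity(scores, labels, threshold=0.5):
    """Classify score >= threshold as positive and return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in sample size, so only suitable for small data.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): 1 = recovered within a year, 0 = not recovered.
labels = [1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]

train_idx = undersample_majority(labels, ratio=1.0, seed=0)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
print("indices kept after under-sampling:", train_idx)
print("sensitivity:", sens, "specificity:", round(spec, 3))
print("AUC:", round(auc(scores, labels), 3))

# The share reported for the highest-risk group is a simple proportion of counts.
print("35/47 =", round(35 / 47, 2))
```

In practice, under-sampling would be applied only to the training data, with the metrics computed on an untouched test set so that they reflect the true outcome rate.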