Since 2005, female firearm suicide rates have increased by 34%, outpacing the rise in male firearm suicide rates over the same period. The objective of this study was to develop and evaluate a natural language processing pipeline to identify a select set of common and important circumstances preceding female firearm suicide from coroner/medical examiner and law enforcement narratives. Unstructured information from coroner/medical examiner and law enforcement narratives was manually coded for 1,462 randomly selected cases from the National Violent Death Reporting System. Decedents from 40 states and Puerto Rico who died between 2014 and 2018 were included. Naive Bayes, Random Forest, Support Vector Machine, and Gradient Boosting classifier models were tuned using 5-fold cross-validation. Model performance was assessed using sensitivity, specificity, positive predictive value, F1, and other metrics. Analyses were conducted from February to November 2022. The natural language processing pipeline performed well in identifying recent interpersonal disputes, problems with intimate partners, acute/chronic pain, and intimate partners and immediate family at the scene. For example, the Support Vector Machine model had a mean of 98.1% specificity and 90.5% positive predictive value in classifying a recent interpersonal dispute before suicide. The Gradient Boosting model had a mean of 98.7% specificity and 93.2% positive predictive value in classifying a recent interpersonal dispute before suicide. This study developed a natural language processing pipeline to classify 5 female firearm suicide antecedents using narrative reports from the National Violent Death Reporting System, which may improve the examination of these circumstances. Practitioners and researchers should weigh the efficiency of natural language processing pipeline development against conventional text mining and manual review.
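The performance metrics named in the abstract (sensitivity, specificity, positive predictive value, and F1) can all be derived from a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the study's code: the function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are hypothetical and chosen only for illustration, with 1 meaning the circumstance was coded as present and 0 as absent.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    sensitivity: float  # TP / (TP + FN), also called recall
    specificity: float  # TN / (TN + FP)
    ppv: float          # TP / (TP + FP), also called precision
    f1: float           # harmonic mean of PPV and sensitivity


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute metrics for one circumstance (hypothetical helper).

    y_true holds the manually coded labels and y_pred the classifier output;
    both use 1 for "circumstance present" and 0 for "absent".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    ppv = _ratio(tp, tp + fp)
    # F1 written in count form: 2TP / (2TP + FP + FN)
    f1 = _ratio(2 * tp, 2 * tp + fp + fn)
    return BinaryMetrics(sensitivity, specificity, ppv, f1)


# Toy labels for illustration only; these are not data from the study.
manual_codes = [1, 1, 1, 0, 0, 0, 0, 0]
model_output = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(manual_codes, model_output)
print(metrics)
```

Because the abstract reports mean values under 5-fold cross-validation, one plausible reading is that metrics like these were computed per fold and then averaged; that interpretation is an assumption here, since the abstract does not spell out the averaging procedure.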