Using Natural Language Processing to Predict Fatal Drug Overdose From Autopsy Narrative Text: Algorithm Development and Validation Study.

Leigh Anne Tang,Dimitrios Zaras,Stephen Espy,Allison Roberts,Sutapa Mukhopadhyay,Colin G Walsh,Jessica Korona-Bailey

doi:10.2196/45246

Open Access

https://doi.org/10.2196/45246

Abstract

Fatal drug overdose surveillance informs prevention but is often delayed because of autopsy report processing and death certificate coding. Autopsy reports contain narrative text describing scene evidence and medical history (similar to preliminary death scene investigation reports) and may serve as early data sources for identifying fatal drug overdoses. To facilitate timely fatal overdose reporting, natural language processing was applied to narrative texts from autopsies.

This study aimed to develop a natural language processing-based model that predicts the likelihood that an autopsy report narrative describes an accidental or undetermined fatal drug overdose.

Autopsy reports of all manners of death (2019-2021) were obtained from the Tennessee Office of the State Chief Medical Examiner. The text was extracted from autopsy reports (PDFs) using optical character recognition. Three common narrative text sections were identified, concatenated, and preprocessed (bag-of-words) using term frequency-inverse document frequency scoring. Logistic regression, support vector machine (SVM), random forest, and gradient boosted tree classifiers were developed and validated. Models were trained and calibrated using autopsies from 2019 to 2020 and tested using those from 2021. Model discrimination was evaluated using the area under the receiver operating characteristic curve, precision, recall, F1-score, and F2-score (which prioritizes recall over precision). Calibration was performed using logistic regression (Platt scaling) and evaluated using the Spiegelhalter z test. Shapley additive explanations values were generated for models compatible with this method. In a post hoc subgroup analysis of the random forest classifier, model discrimination was evaluated by forensic center, race, age, sex, and education level.

A total of 17,342 autopsies (n=5934, 34.22% cases) were used for model development and validation. The training set included 10,215 autopsies (n=3342, 32.72% cases), the calibration set included 538 autopsies (n=183, 34.01% cases), and the test set included 6589 autopsies (n=2409, 36.56% cases). The vocabulary set contained 4002 terms. All models showed excellent performance (area under the receiver operating characteristic curve ≥0.95, precision ≥0.94, recall ≥0.92, F1-score ≥0.94, and F2-score ≥0.92). The SVM and random forest classifiers achieved the highest F2-scores (0.948 and 0.947, respectively). The logistic regression and random forest classifiers were calibrated (P=.95 and P=.85, respectively), whereas the SVM and gradient boosted tree classifiers were miscalibrated (P=.03 and P<.001, respectively). "Fentanyl" and "accident" had the highest Shapley additive explanations values. Post hoc subgroup analyses revealed lower F2-scores for autopsies from forensic centers D and E. Lower F2-scores were also observed for the American Indian, Asian, ≤14 years, and ≥65 years subgroups, but larger sample sizes are needed to validate these findings.

The random forest classifier may be suitable for identifying potential accidental and undetermined fatal overdose autopsies. Further validation studies should be conducted to ensure early detection of accidental and undetermined fatal drug overdoses across all subgroups.
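Two of the evaluation measures named in the abstract, the F2-score and the Spiegelhalter z test, can be written out in a few lines. The following is a minimal sketch using the standard textbook definitions, not the authors' code; the function names `f_beta` and `spiegelhalter_z` and the toy labels and probabilities are assumptions made for illustration only.

```python
import math


def f_beta(precision, recall, beta=2.0):
    """F-beta score; beta > 1 weights recall more heavily than precision.

    With beta=2 this is the F2-score, with beta=1 the usual F1-score.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def spiegelhalter_z(y_true, y_prob):
    """Spiegelhalter z statistic and two-sided P value for calibration.

    y_true: iterable of 0/1 outcomes.
    y_prob: iterable of predicted probabilities for the positive class.
    A small P value suggests the predicted probabilities are miscalibrated.
    """
    numerator = sum((y - p) * (1 - 2 * p) for y, p in zip(y_true, y_prob))
    variance = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in y_prob)
    if variance == 0:
        raise ValueError("z is undefined when every probability is 0, 0.5, or 1")
    z = numerator / math.sqrt(variance)
    # Two-sided P value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical toy data, not taken from the study.
labels = [1, 0, 1, 0]
probabilities = [0.8, 0.2, 0.8, 0.2]
z_stat, p_val = spiegelhalter_z(labels, probabilities)

# Same precision/recall pair swapped: F2 rewards the high-recall case more.
f2_high_recall = f_beta(precision=0.5, recall=1.0, beta=2.0)
f2_high_precision = f_beta(precision=1.0, recall=0.5, beta=2.0)
```

In this sketch, swapping precision and recall changes the F2-score but would leave the F1-score unchanged, which is why an F2-score suits a surveillance setting where missed overdose cases are costlier than false alarms.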

Journal: JMIR public health and surveillance	Publication Date: May 19, 2023
Citations: 3	License type: cc-by

Similar Papers

Diphenhydramine-involved Fatal and Nonfatal Drug Overdoses in Tennessee, 2019–2022
Sarah Riley Saint ... Sutapa Mukhopadhyay
Substance Use & Misuse | VOL. 59
07 Nov 2023

Suicide and fatal drug overdose in child sexual abuse victims: a historical cohort study
Margaret C Cutajar ... Josie Spataro
Medical Journal of Australia | VOL. 192
01 Feb 2010

Prior Emergency Medical Services Utilization Among People Who Had an Accidental Opioid-Involved Fatal Drug Overdose-Rhode Island, 2018-2020.
Kailai Duan ... Melissa Basta
Public health reports (Washington, D.C. : 1974) | VOL. 139
09 Mar 2023

The impact of opioid agonist treatment on fatal and non-fatal drug overdose among people with a history of opioid dependence in NSW, Australia, 2001-2018: findings from the OATS retrospective linkage study.
Nicola Jones ... Matthew Hickman
International Journal of Population Data Science | VOL. 7
25 Aug 2022
