Abstract
BackgroundThe objective was to develop and assess performance of an algorithm predicting suicide-related ICD codes within three months of psychiatric discharge. MethodsThis prognostic study used a retrospective cohort of EHR data from 2789 youth (12 to 20 years old) hospitalized in a safety net institution in the Northeastern United States. The dataset combined structured data with unstructured data obtained through natural language processing of clinical notes. Machine learning approaches compared gradient boosting to random forest analyses. ResultsArea under the ROC and precision-recall curve were 0.88 and 0.17, respectively, for the final Gradient Boosting model. The cutoff point of the model-generated predicted probabilities of suicide that optimally classified the individual as high risk or not was 0.009. When applying the chosen cutoff (0.009) to the hold-out testing set, the model correctly identified 8 positive cases out of 10, and 418 negative cases out 548. The corresponding performance metrics showed 80 % sensitivity, 76 % specificity, 6 % PPV, 99 % NPV, F-1 score of 0.11, and an accuracy of 76 %. LimitationsThe data in this study comes from a single health system, possibly introducing bias in the model's algorithm. Thus, the model may have underestimated the incidence of suicidal behavior in the study population. Further research should include multiple system EHRs. ConclusionsThese performance metrics suggest a benefit to including both unstructured and structured data in design of predictive algorithms for suicidal behavior, which can be integrated into psychiatric services to help assess risk.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.