Abstract

Suicide is a leading cause of death in the US. However, official national statistics on suicide rates are delayed by 1 to 2 years, hampering evidence-based public health planning and decision-making. To estimate weekly suicide fatalities in the US in near real time. This cross-sectional national study used a machine learning pipeline to combine signals from several streams of real-time information to estimate weekly suicide fatalities in the US in near real time. This 2-phase approach first fits optimal machine learning models to each individual data stream and subsequently combines predictions made from each data stream via an artificial neural network. National-level US administrative data on suicide deaths, health services, and economic, meteorological, and online data were variously obtained from 2014 to 2017. Data were analyzed from January 1, 2014, to December 31, 2017. Longitudinal data on suicide-related exposures were obtained from multiple, heterogeneous streams: emergency department visits for suicide ideation and attempts collected via the National Syndromic Surveillance Program (2015-2017); calls to the National Suicide Prevention Lifeline (2014-2017); calls to US poison control centers for intentional self-harm (2014-2017); consumer price index and seasonality-adjusted unemployment rate, hourly earnings, home price index, and 3-month and 10-year yield curves from the Federal Reserve Economic Data (2014-2017); weekly daylight hours (2014-2017); Google and YouTube search trends related to suicide (2014-2017); and public posts on suicide on Reddit (2 314 533 posts), Twitter (9 327 472 tweets; 2015-2017), and Tumblr (1 670 378 posts; 2014-2017). Weekly estimates of suicide fatalities in the US were obtained through a machine learning pipeline that integrated the above data sources. Estimates were compared statistically with actual fatalities recorded by the National Vital Statistics System. Combining information from multiple data streams, the machine learning method yielded estimates of weekly suicide deaths with high correlation to actual counts and trends (Pearson correlation, 0.811; P < .001), while estimating annual suicide rates with low error (0.55%). The proposed ensemble machine learning framework reduces the error for annual suicide rate estimation to less than one-tenth of that of current forecasting approaches that use only historical information on suicide deaths. These findings establish a novel approach for tracking suicide fatalities in near real time and provide the potential for an effective public health response such as supporting budgetary decisions or deploying interventions.

Highlights

  • Suicide is a significant contributor to global mortality, resulting in nearly 800 000 deaths worldwide each year.[1]

  • The proposed ensemble machine learning framework reduces the error for annual suicide rate estimation to less than one-tenth of that of current forecasting approaches that use only historical information on suicide deaths

  • These findings establish a novel approach for tracking suicide fatalities in near real time and provide the potential for an effective public health response such as supporting budgetary decisions or deploying interventions

Read more

Summary

Introduction

Suicide is a significant contributor to global mortality, resulting in nearly 800 000 deaths worldwide each year.[1] Currently, suicide rates in the US are at their highest levels in more than 50 years after having experienced a relatively rapid, recent increase of 35% from 1999 to 2018 alone.[2,3] Despite the urgency of this public health problem, real-time information on suicide fatality trends to guide prevention efforts is lacking This is because national statistics on suicide rates are delayed by 1 to 2 years.[4,5] Official statistics on suicide fatalities are produced by the Centers for Disease Control and Prevention using data from the National Vital Statistics System. Reasons for the delay in suicide fatality data include the decentralized nature of mortality statistics, critical shortages of workers such as forensic pathologists, variation in local information technology infrastructure, and the time needed to investigate deaths due to suicide.[4,7,8]

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.