Abstract

Background: India is home to 20% of the world’s suicide deaths. Although statistics regarding suicide in India are distressingly high, data and cultural issues likely contribute to a widespread underreporting of the problem. Social stigma and the only recent decriminalization of suicide are among the factors hampering official agencies’ collection and reporting of suicide rates.

Objective: As the product of a data collaborative, this paper leverages private-sector search engine data to gain a fuller, more accurate picture of suicide among young people in India. By combining official statistics on suicide with data generated through search queries, this paper seeks to (1) add a layer of information that more accurately represents the magnitude of the problem, (2) determine whether search query data can serve as an effective proxy for factors contributing to suicide that are not represented in traditional datasets, and (3) consider how data collaboratives built on search query data could inform future suicide prevention efforts in India and beyond.

Methods: We combined official statistics on demographic information with data generated through search queries from Bing to gain insight into suicide rates per state in India as reported by the National Crime Records Bureau (NCRB) of India. We extracted English-language queries on “suicide,” “depression,” “hanging,” “pesticide,” and “poison.” We also collected state-level demographic information, including urbanization, growth rate, sex ratio, internet penetration, and population. We modeled the suicide rate per state as a function of the queries on each of the 5 topics, treated as linear independent variables. A second model was built by integrating the demographic information as additional linear independent variables.

Results: The fit of the first model (R²), when modeling suicide rates from the fraction of queries in each of the 5 topics, as well as from the fraction of queries on all suicide methods, shows a correlation of about 0.5. This increases significantly with the removal of 3 outliers and improves slightly when 5 outliers are removed. The fit of the second model, which uses both query and demographic data, shows that for all categories, if no outliers are removed, demographic data model suicide rates better than query data. However, when 3 outliers are removed, adding query data about pesticides or poisons improves the model over using demographic data alone.

Conclusions: In this work, we used search data and demographics to model suicide rates. In this way, search data serve as a proxy for unmeasured (hidden) factors corresponding to suicide rates. Moreover, our procedure for outlier rejection serves to single out states where the suicide rates have substantially different correlations with demographic factors and query rates.
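The modeling procedure summarized in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic (not the Bing or NCRB figures), the five query topics enter one joint ordinary least squares fit, and outliers are chosen by repeatedly dropping the state with the largest absolute residual. The abstract does not state the paper's actual outlier rule, and the helper names `fit_r2` and `drop_outliers` are hypothetical.

```python
import numpy as np


def fit_r2(X, y):
    """Ordinary least squares with an intercept.

    Returns (coefficients, R^2, residuals); coefficients[0] is the intercept.
    """
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return coef, 1.0 - ss_res / ss_tot, resid


def drop_outliers(X, y, n_drop):
    """Assumed rule (not from the paper): refit, then drop the row with the
    largest absolute residual, n_drop times. Returns indices of kept rows."""
    keep = np.arange(len(y))
    for _ in range(n_drop):
        _, _, resid = fit_r2(X[keep], y[keep])
        keep = np.delete(keep, np.argmax(np.abs(resid)))
    return keep


# Synthetic stand-in data: one row per hypothetical state.
rng = np.random.default_rng(0)
n_states = 30
# Columns: fraction of queries on suicide, depression, hanging, pesticide, poison.
queries = rng.random((n_states, 5))
# Columns: urbanization, growth rate, sex ratio, internet penetration, population.
demographics = rng.random((n_states, 5))
rate = 10.0 + queries @ np.array([4.0, 2.0, 1.0, 3.0, 3.0]) + rng.normal(0.0, 1.0, n_states)
rate[:3] += 25.0  # three planted outlier states

# First model: query fractions only, all states.
_, r2_all, _ = fit_r2(queries, rate)

# Same model after removing 3 outliers under the assumed rule.
keep = drop_outliers(queries, rate, 3)
_, r2_queries, _ = fit_r2(queries[keep], rate[keep])

# Second model: query fractions plus demographics on the same retained states.
both = np.column_stack([queries, demographics])
_, r2_both, _ = fit_r2(both[keep], rate[keep])

print(f"R^2, queries only, all states:        {r2_all:.3f}")
print(f"R^2, queries only, 3 outliers removed: {r2_queries:.3f}")
print(f"R^2, queries + demographics, trimmed:  {r2_both:.3f}")
```

Because the second model nests the first, its in-sample R² on the same states can never be lower. Comparing the two fits fairly would therefore need an adjusted or out-of-sample measure, which this sketch omits.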

Highlights

  • According to the World Health Organization (WHO), close to 800,000 people die by suicide every year, with 78% of global suicides occurring in low- and middle-income countries [1]

  • We combined official statistics on demographic information with data generated through search queries from Bing to gain insight into suicide rates per state in India as reported by the National Crime Records Bureau (NCRB) of India

  • Search data serve as a proxy for unmeasured factors corresponding to suicide rates

Summary

Introduction

According to the World Health Organization (WHO), close to 800,000 people die by suicide every year, with 78% of global suicides occurring in low- and middle-income countries [1]. Teenagers and young adolescents are at risk, as suicide is the second leading cause of death among 15-29-year-olds worldwide [1]. Even these concerning figures do not fully capture the magnitude of the problem. India is home to 20% of the world’s suicide deaths [2], yet the issue attracts limited national public health attention [3]. The WHO reported 170,000 suicide deaths in India, about 35,000 more than the figure in the National Crime Records Bureau’s (NCRB) data [3]. Although statistics regarding suicide in India are distressingly high, data and cultural issues likely contribute to a widespread underreporting of the problem. Social stigma and the only recent decriminalization of suicide are among the factors hampering official agencies’ collection and reporting of suicide rates.

