Abstract

Background: The impending scale-up of noncommunicable disease screening programs in low- and middle-income countries, coupled with limited health resources, requires that such programs be as accurate as possible at identifying patients at high risk.

Objective: The aim of this study was to develop machine learning–based risk stratification algorithms for diabetes and hypertension that are tailored for the at-risk population served by community-based screening programs in low-resource settings.

Methods: We trained and tested our models by using data from 2278 patients collected by community health workers through door-to-door and camp-based screenings in the urban slums of Hyderabad, India, between July 14, 2015, and April 21, 2018. We determined the best model for predicting short-term (2-month) risk of each disease (one model for diabetes and one for hypertension) and compared these models to previously developed risk scores from the United States and the United Kingdom. Prediction accuracy was characterized by the area under the receiver operating characteristic curve (AUC) and the number of false negatives.

Results: Models based on random forest had the highest prediction accuracy for both diseases and outperformed the US and UK risk scores in terms of AUC by 35.5% for diabetes (improvement of 0.239 from 0.671 to 0.910) and 13.5% for hypertension (improvement of 0.094 from 0.698 to 0.792). For a fixed screening specificity of 0.9, the random forest model reduced the expected number of false negatives by 620 patients per 1000 screenings for diabetes and 220 patients per 1000 screenings for hypertension. This improvement reduces the cost of incorrect risk stratification by US $1.99 (or 35%) per screening for diabetes and US $1.60 (or 21%) per screening for hypertension.

Conclusions: In the next decade, health systems in many countries are planning to spend significant resources on noncommunicable disease screening programs. Our study demonstrates that machine learning models can help these programs use limited resources effectively by improving risk stratification.
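The two evaluation metrics named above (AUC, and false negatives at a fixed specificity of 0.9) can be illustrated with a minimal sketch. This is not the study's code: the function names, the thresholding rule, and the toy scores below are assumptions made up for illustration, and the numbers they produce have no relation to the reported results.

```python
import math


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative (rank-sum formulation); tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def false_negatives_at_specificity(labels, scores, specificity=0.9):
    """Pick the lowest threshold that classifies at least `specificity`
    of the negatives correctly, then count positives at or below it.
    A patient is flagged as high risk when score > threshold."""
    neg = sorted(s for y, s in zip(labels, scores) if y == 0)
    # Number of negatives that must fall at or below the threshold;
    # the small tolerance guards against floating-point round-up.
    k = max(1, math.ceil(specificity * len(neg) - 1e-9))
    threshold = neg[k - 1]
    missed = sum(1 for y, s in zip(labels, scores) if y == 1 and s <= threshold)
    return threshold, missed


# Hypothetical toy data: 4 patients with the disease, 10 without.
labels = [1, 1, 1, 1] + [0] * 10
candidate_scores = [0.5, 0.6, 0.7, 0.9,
                    0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.8]
baseline_scores = [0.35, 0.45, 0.75, 0.8,
                   0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.9]

for name, scores in [("candidate", candidate_scores), ("baseline", baseline_scores)]:
    threshold, missed = false_negatives_at_specificity(labels, scores, 0.9)
    print(f"{name}: AUC={auc(labels, scores):.3f}, "
          f"threshold={threshold}, false negatives={missed}")
```

Fixing specificity before counting false negatives keeps the comparison fair: both scores flag roughly the same share of healthy patients, so any difference in missed cases reflects how well each score ranks patients rather than where its cutoff happens to sit.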

Highlights

  • Noncommunicable diseases, including diabetes, hypertension, and cardiovascular disease, are a global health priority [1]

  • We developed new risk stratification algorithms that are tailored for community-based screening programs in low- and middle-income countries with limited screening data

  • We focused on the random forest model when comparing with baseline approaches for both diabetes and hypertension


Summary

Introduction

Noncommunicable diseases, including diabetes, hypertension, and cardiovascular disease, are a global health priority [1]. They disproportionately affect low- and middle-income countries, where more than 75% of all noncommunicable disease deaths (~31 million per year) occur, including over 16 million annual deaths in adults between the ages of 30 and 69 years [1]. Health systems in many low- and middle-income countries are already overburdened with an unfinished agenda on infectious diseases [7] and do not have enough capacity to conduct national-level noncommunicable disease screening programs [8]. The impending scale-up of noncommunicable disease screening programs in these countries, coupled with limited health resources, requires that such programs be as accurate as possible at identifying patients at high risk.

