Machine learning to identify chronic cough from administrative claims data

Vishal Bali, Vladimir Turzhitsky, Jonathan Schelfhout, Misti Paudel, Erin Hulbert, Jesse Peterson-Brandt, Jeffrey Hertzberg, Neal R Kelly, Raja H Patel

doi:10.1038/s41598-024-51522-9

Abstract

Accurate identification of patient populations is an essential component of clinical research, especially for medical conditions such as chronic cough that are inconsistently defined and diagnosed. We aimed to develop and compare machine learning models to identify chronic cough from medical and pharmacy claims data. In this retrospective observational study, we compared 3 machine learning algorithms based on XGBoost, logistic regression, and neural network approaches using a large claims and electronic health record database. Of the 327,423 patients who met the study criteria, 4,818 had chronic cough based on linked claims–electronic health record data. The XGBoost model showed the best performance, achieving an area under the receiver operating characteristic curve (ROC-AUC) of 0.916. We selected a cutoff that favors a high positive predictive value (PPV) to minimize false positives, resulting in a sensitivity, specificity, PPV, and negative predictive value (NPV) of 18.0%, 99.6%, 38.7%, and 98.8%, respectively, on the held-out testing set (n = 82,262). The logistic regression and neural network models achieved lower ROC-AUCs of 0.907 and 0.838, respectively. The XGBoost and logistic regression models maintained their robust performance in subgroups of individuals with higher rates of chronic cough. Machine learning algorithms are one way of identifying conditions that are not coded in medical records and can help identify individuals with chronic cough from claims data with a high degree of classification value.
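The four threshold-dependent metrics reported above are tied together by the confusion matrix and by disease prevalence. The following Python sketch is illustrative only and is not the authors' code: the function names and the example counts are hypothetical, and the second calculation assumes that the cohort-wide prevalence (4,818 of 327,423) also applies to the held-out testing set. Because the published sensitivity and specificity are rounded, the derived PPV only approximates the reported 38.7%.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that are flagged
        "specificity": tn / (tn + fp),  # share of non-cases that are not flagged
        "ppv": tp / (tp + fp),          # share of flagged patients who are true cases
        "npv": tn / (tn + fn),          # share of unflagged patients who are non-cases
    }


def predictive_values(sensitivity, specificity, prevalence):
    """Derive (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical counts for illustration only; these are NOT taken from the study.
m = confusion_metrics(tp=20, fp=30, tn=900, fn=50)
print({name: round(value, 3) for name, value in m.items()})

# Rounded figures from the abstract. Assumption: the cohort-wide prevalence
# also holds in the held-out testing set.
prevalence = 4818 / 327423
ppv, npv = predictive_values(sensitivity=0.180, specificity=0.996, prevalence=prevalence)
print(f"prevalence={prevalence:.4f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

At a prevalence of roughly 1.5%, even a specificity near 99.6% leaves the false positives comparable in number to the true positives, which is why a cutoff chosen to favor PPV comes at the cost of low sensitivity.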

Journal: Scientific Reports
Publication Date: Jan 30, 2024
License type: CC BY 4.0

Similar Papers

Agreement and validity of electronic health record prescribing data relative to pharmacy claims data: A validation study from a US electronic health record database.
Christopher G Rowan ... John K Cuddeback
Pharmacoepidemiology and Drug Safety | VOL. 26
12 Jun 2017

The Use of a Large Electronic Health Records System to Define and Characterize a Thrombotic Thrombocytopenic Purpura Population
Andrew Bevan ... Virgil Rose
Blood | VOL. 142
02 Nov 2023

The values of serum human epididymis secretory protein 4 and CA(125) assay in the diagnosis of ovarian malignancy
Li Dong ... Heng Cui
Zhonghua fu chan ke za zhi | VOL. 43
01 Dec 2008

Validity of International Classification of Diseases Codes for Identifying Neuro-Ophthalmic Disease in Large Data Sets: A Systematic Review.
Ali G Hamedani ... Heather E Moss
Journal of Neuro-Ophthalmology | VOL. 40
19 Jun 2020
