Data mining information from electronic health records produced high yield and accuracy for current smoking status

T Katrien J Groenhof, Pim A De Jong, Hendrik M Nathoe, A Titia Lely, Niels P Van Der Kaaij, Gert-Jan De Borst, Marianne C Verhaar, Ynte M Ruigrok, Laurien R Koers, Saskia Haitjema, Diederick E Grobbee, Michiel L Bots, Mark De Groot, Tim Leiner, Enja Blasse, Mirjam I Geerlings, Folkert W Asselbergs, L Jaap Kappelle, Jan Westerink, M H Emmelot, Imo E Hoefer, Frank L J Visseren, Wouter W Van Solinge

doi:10.1016/j.jclinepi.2019.11.006

Abstract

Objectives: Researchers are increasingly using routine clinical data for care evaluations and feedback to patients and clinicians. The quality of these evaluations depends on the quality and completeness of the input data.

Study Design and Setting: We assessed the performance of an electronic health record (EHR)-based data mining algorithm, using smoking status in a cardiovascular population as an example. As a reference standard, we used the questionnaire from the Utrecht Cardiovascular Cohort (UCC). To assess diagnostic accuracy, we calculated sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV).

Results: We analyzed 1,661 patients included in the UCC up to January 18, 2019. Of those, 14% (n = 238) had missing information on smoking status in the UCC questionnaire. Data mining provided information on smoking status in 99% of the 1,661 participants. Diagnostic accuracy for current smoking was sensitivity 88%, specificity 92%, NPV 98%, and PPV 63%. Of the false positives, 85% reported that they had quit smoking at the time of the UCC.

Conclusion: Data mining showed great potential in retrieving information on smoking (a near complete yield). Its diagnostic performance is good for negative smoking statuses. The implications of misclassification with data mining depend on the application of the data.
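
The four diagnostic accuracy measures named in the abstract follow from a 2x2 table that compares the data mining result with the reference standard. The sketch below is a minimal illustration of those standard definitions, not the authors' code. The `ConfusionCounts` class, the `diagnostic_accuracy` function, and the example counts are all hypothetical; the counts are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 table: index test (data mining) vs. reference standard."""
    tp: int  # test positive, reference positive
    fp: int  # test positive, reference negative
    fn: int  # test negative, reference positive
    tn: int  # test negative, reference negative


def diagnostic_accuracy(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV, and NPV as proportions in [0, 1]."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positives among reference positives
        "specificity": c.tn / (c.tn + c.fp),  # true negatives among reference negatives
        "ppv": c.tp / (c.tp + c.fp),          # true positives among test positives
        "npv": c.tn / (c.tn + c.fn),          # true negatives among test negatives
    }


# Hypothetical counts for illustration only (NOT taken from the study).
example = ConfusionCounts(tp=40, fp=10, fn=10, tn=140)
metrics = diagnostic_accuracy(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

PPV and NPV depend on how common current smoking is in the population, so a test with high sensitivity and specificity can still show a modest PPV when few participants are current smokers.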

Journal: Journal of Clinical Epidemiology
Publication Date: Nov 12, 2019
Citations: 27
License type: cc-by-nc-nd

Similar Papers

Performance of an artificial intelligence tool with real-time clinical workflow integration - Detection of intracranial hemorrhage and pulmonary embolism.
Nico Buls ... Koenraad Nieboer
Physica Medica | VOL. 83
01 Mar 2021

Accuracy of the “Triple Test” in the Diagnosis of Palpable Breast Masses in Saudi Females
Abdulrahman Saleh Al-Mulhim ... Adel Mohammed Ali
Annals of Saudi Medicine | VOL. 23
01 May 2003

Assessing the performance of diagnostic test accuracy measures
Sofia Tsokani ... Dimitris Mavridis
American Journal of Orthodontics and Dentofacial Orthopedics | VOL. 161
23 Apr 2022

Cranial ultrasound and neurophysiological testing to predict neurological outcome in infants born very preterm.
Helen Franckx ... Daniele Hasaerts
Developmental medicine and child neurology | VOL. 60
07 Jul 2018
