Abstract
Background
As opioid prescriptions have risen, opioid use disorder (OUD) and its adverse outcomes have also increased. Accurate and complete epidemiologic surveillance of OUD, which is needed to inform prevention strategies, remains challenging. The objective of this study was to ascertain the prevalence of OUD using two methods of identifying OUD in electronic health records (EHR): applying natural language processing (NLP) to mine the text of unstructured clinical notes, and using ICD-10-CM diagnostic codes.

Methods
Data were drawn from EHR for hospital and emergency department patient visits to a large regional academic medical center from 2017 to 2019. International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) discharge codes were extracted for each visit. The rule-based NLP algorithm was developed in a stepwise process. First, a small sample of visits from 2017 was used to develop initial dictionaries. Next, EHR data corresponding to 30,124 visits from 2018 were used to develop and evaluate the rule-based algorithm. A random sample of the results was manually reviewed to identify and address shortcomings in the algorithm, and to estimate the sensitivity and specificity of the two methods of ascertainment. Last, the final algorithm was applied to 29,212 visits from 2019 to estimate OUD prevalence.

Results
Overall, 2,332 unique visits were identified by at least one method, with substantial overlap: 1,381 (59.2 %) were identified by both methods. Of the total unique visits, 430 (18.4 %) were identified only by ICD-10-CM codes, and 521 (22.3 %) were identified only by NLP. The prevalence of visits with evidence of an OUD diagnosis in this sample, ascertained using only ICD-10-CM codes, was 1,811/29,212 (6.1 %). Including the additional 521 visits identified only by NLP, the estimated prevalence of OUD was 2,332/29,212 (7.9 %), a relative increase of 29.5 % compared to the use of ICD-10-CM codes alone. The estimated sensitivity and specificity of the NLP-based OUD classification were 81.8 % and 97.5 %, respectively, relative to gold-standard manual review by an expert addiction medicine physician.

Conclusion
NLP-based algorithms can automate data extraction and identify evidence of opioid use disorder in unstructured electronic health records. The most complete ascertainment of OUD in EHR was achieved by combining NLP with ICD-10-CM codes. NLP should be considered for epidemiological studies involving EHR data.
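The way the two ascertainment methods combine can be illustrated with a short calculation. The following is a minimal sketch, not the authors' code: it takes the three visit counts reported in the Results (identified by both methods, by ICD-10-CM codes only, and by NLP only) and derives the per-method total, the combined total, and each group's share of the unique visits. The function name `overlap_summary` and the dictionary keys are hypothetical, introduced only for illustration.

```python
def overlap_summary(both, icd_only, nlp_only):
    """Summarize how two case-finding methods overlap.

    All names here are illustrative assumptions; only the counts
    passed in below come from the abstract.
    """
    icd_total = both + icd_only                # visits flagged by ICD-10-CM codes
    nlp_total = both + nlp_only                # visits flagged by NLP
    union_total = both + icd_only + nlp_only   # visits flagged by either method

    def share(count):
        # Percentage of all unique visits, rounded to one decimal place.
        return round(100.0 * count / union_total, 1)

    return {
        "icd_total": icd_total,
        "nlp_total": nlp_total,
        "union_total": union_total,
        "pct_both": share(both),
        "pct_icd_only": share(icd_only),
        "pct_nlp_only": share(nlp_only),
    }


# Counts reported for the 2019 sample.
summary = overlap_summary(both=1381, icd_only=430, nlp_only=521)
print(summary["icd_total"])     # 1811
print(summary["union_total"])   # 2332
print(summary["pct_both"], summary["pct_icd_only"], summary["pct_nlp_only"])
# 59.2 18.4 22.3
```

Working from the three disjoint groups rather than from the two per-method totals avoids double-counting the visits that both methods identified.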