Abstract

Identifying when an incident diabetes mellitus (DM) diagnosis was made is difficult using retrospective, structured electronic health record (EHR) data alone. Unstructured clinical notes have been underused but contain valuable information that could complement traditional methods. However, manually reviewing clinical notes is time-consuming. We developed and validated a simple rule-based Natural Language Processing (NLP) method to extract incident DM timing from clinical notes. At a single center, we used structured EHR data to identify a cohort (age <45 as of 12/31/19) with likely type 1 (T1D) or type 2 (T2D) DM based on 2016-2019 records: (≥1 T1D ICD-10 code and insulin and no other DM medication) or ([≥2 T2D and no T1D codes] or [≥1 T2D code and a DM medication besides insulin or metformin]). This cohort had 2,654 patients (548,316 clinical notes, 2003-present). We randomly selected 58,450 clinical notes (1,465 patients) as a training set in which to look for relevant text patterns, and we handcrafted the resulting rules into our NLP tool. We required 3 distinct concepts at the sentence level to determine an incident DM diagnosis: DM (not, e.g., epilepsy), an onset attribute (e.g., “diagnosed in”), and a temporal component (e.g., 8/2008). We pre-defined all related keywords and date formats for these concepts in our training notes. We then tested the NLP algorithm against manual review in an independent set of 100 randomly selected patients from the cohort. Analysis was at the patient level (a patient was a true positive if ≥1 of that patient's notes was a true positive). In the training set, NLP found 1,268 patients with at least 1 of the 3 concepts and 826 patients with all 3. In the test set, we excluded 4 patients without substantive notes. NLP correctly detected incident DM timing in 73 of the remaining 96 patients. The NLP algorithm had a recall of 88%, specificity of 77%, precision (PPV) of 96%, and NPV of 50%. NLP was helpful in finding incident DM timing and may complement structured EHR queries for identifying incident DM. Refinement of our NLP algorithm is ongoing.
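
The sentence-level rule described above (a DM term, an onset attribute, and a temporal component co-occurring in one sentence) can be illustrated with a minimal sketch. This is not the study's tool: the keyword lists, date formats, sentence splitter, and the function names `split_sentences`, `find_onset_mentions`, and `patient_has_onset` are all hypothetical, since the abstract does not give the actual lexicon. The sketch also checks only co-occurrence; it does not verify that the onset phrase refers to DM rather than to another condition named in the same sentence.

```python
import re

# Hypothetical lexicons; the study's actual keywords and date formats are not published here.
DM_PATTERN = re.compile(r"\b(?:diabetes|diabetic|DM|T1D|T2D|T1DM|T2DM)\b", re.IGNORECASE)
ONSET_PATTERN = re.compile(r"\b(?:diagnosed|diagnosis|onset|dx)\b", re.IGNORECASE)
DATE_PATTERN = re.compile(
    r"\b(?:\d{1,2}/\d{1,2}/\d{2,4}"      # e.g. 12/31/19
    r"|\d{1,2}/\d{4}"                    # e.g. 8/2008
    r"|(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\.?\s+\d{4}"  # e.g. Aug 2008
    r"|(?:19|20)\d{2})\b",               # bare four-digit year, e.g. 2008
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive splitter: break after ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_onset_mentions(note_text):
    """Return (sentence, date_string) pairs for sentences containing all 3 concepts."""
    mentions = []
    for sentence in split_sentences(note_text):
        date_match = DATE_PATTERN.search(sentence)
        if DM_PATTERN.search(sentence) and ONSET_PATTERN.search(sentence) and date_match:
            mentions.append((sentence, date_match.group(0)))
    return mentions


def patient_has_onset(notes):
    """Patient-level flag: positive if at least one note yields a mention."""
    return any(find_onset_mentions(note) for note in notes)


example_note = (
    "Patient seen for follow-up. "
    "She was diagnosed with type 2 diabetes in 8/2008. "
    "Epilepsy diagnosed in 2001. "
    "Diabetes is well controlled."
)
example_mentions = find_onset_mentions(example_note)
```

In this example only the second sentence qualifies: the epilepsy sentence has an onset attribute and a date but no DM term, and the last sentence has a DM term but neither of the other two concepts.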
Disclosure: A. Wong: None. V.W. Zhong: None. M. Rosenman: None. Funding: Centers for Disease Control and Prevention (1U18DP006693-01-00)
