Abstract

Background

Free-text fields in electronic medical records (EMRs) are a rich source of information about persons with dementia. The signs and symptoms of dementia (e.g., responsive behaviours, cognitive impairment) can present to primary care providers many years before a formal diagnosis. We used natural language processing (NLP) to develop a list of features (i.e., dementia-related key words) and compare classification algorithms to identify persons with dementia based on signs and symptoms documented in primary care EMRs.

Method

We used a validated algorithm based on administrative data to identify 526 persons with incident dementia (known positives) and 44,148 persons without (known negatives) aged 66+ from a primary care EMR database in Ontario, Canada between April 2010 and March 2018. A list of 900+ features associated with dementia was developed using literature review, clinician input and associated word embeddings. We trained a series of classification algorithms (e.g., gradient boosted models, neural networks, lasso and ridge regression) separately in progress notes and consult notes and compared their performance using nested 10-fold cross validation.

Result

Persons with dementia were older (mean: 80.3 vs. 74.6 years) and more likely to have 5+ chronic conditions (11.6% vs. 7.8%). Persons with dementia had a median of 30.3 features per progress note (IQR: 23.8, 40.4) and 54.7 per consult note (IQR: 26.6, 83.8) compared to 27.5 (IQR: 21.3, 36.5) and 32.1 (IQR: 14.0, 55.6) for persons without dementia. Out of eight thematic groups (cognition, social, health system use, function, medication-dementia, medication, symptoms, other), persons with dementia showed substantially more features related to cognition, social and medication-dementia in progress and consult notes compared to persons without dementia. Using progress notes, the classification algorithm involving neural networks showed the best performance (sensitivity: 66.2%, positive predictive value [PPV]: 81.3%). Using consult notes, the gradient-boosted classifier performed best (sensitivity: 45.4%, PPV: 66.5%).

Conclusion

We used NLP to discover informative features and develop classification algorithms to identify persons with dementia using free-text EMR data. This could be used to improve recognition of early signs and symptoms of dementia by primary care providers to provide patients with appropriate interventions, including assessments, imaging and specialist referrals.
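
As a rough illustration of two quantities reported in this abstract, the Python sketch below counts keyword features in a note and computes sensitivity and PPV from binary labels. It is not the study's implementation: the four-item feature list, the function names and the whole-phrase matching rule are all assumptions made for the example, and the study's actual 900+ features and models are not reproduced here.

```python
import re
from typing import Iterable, Sequence, Tuple

# Hypothetical stand-in for the study's feature list (which had 900+ entries).
EXAMPLE_FEATURES = ["memory loss", "confusion", "wandering", "donepezil"]


def count_features(note: str, features: Iterable[str] = EXAMPLE_FEATURES) -> int:
    """Count case-insensitive, whole-phrase occurrences of each feature in a note."""
    total = 0
    for feature in features:
        pattern = r"\b" + re.escape(feature) + r"\b"
        total += len(re.findall(pattern, note, flags=re.IGNORECASE))
    return total


def sensitivity_and_ppv(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, PPV) for binary labels where 1 means dementia.

    sensitivity = TP / (TP + FN): share of known positives that were flagged.
    PPV         = TP / (TP + FP): share of flagged persons who are known positives.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv
```

With roughly 1 known positive per 84 known negatives in the cohort (526 vs. 44,148), overall accuracy would be dominated by the negatives, which is one reason sensitivity and PPV are the more informative pair to report.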
