Classifying the lifestyle status for Alzheimer’s disease from clinical notes using deep learning with weak supervision

Zitao Shen, Dalton Schutte, Yoonkwon Yi, Anusha Bompelli, Fang Yu, Yanshan Wang, Rui Zhang

doi:10.1186/s12911-022-01819-4

Abstract

Background: Since no effective therapies exist for Alzheimer’s disease (AD), prevention through lifestyle changes and interventions has become more critical. Analyzing electronic health records (EHRs) of patients with AD can help us better understand the effect of lifestyle on AD. However, lifestyle information is typically stored in clinical narratives. Thus, the objective of the study was to compare different natural language processing (NLP) models for classifying lifestyle statuses (e.g., physical activity and excessive diet) from clinical texts in English.

Methods: Based on the collected concept unique identifiers (CUIs) associated with lifestyle status, we extracted all related EHRs for patients with AD from the Clinical Data Repository (CDR) of the University of Minnesota (UMN). We automatically generated labels for the training data using a rule-based NLP algorithm. On this weakly labeled training corpus, we trained pre-trained Bidirectional Encoder Representations from Transformers (BERT) models with weak supervision, along with three traditional machine learning models as baselines. These models include the BERT base model, PubMedBERT (abstracts + full text), PubMedBERT (abstracts only), Unified Medical Language System (UMLS) BERT, BioBERT, Bio-clinical BERT, logistic regression, support vector machine, and random forest. We performed two case studies, physical activity and excessive diet, to validate the effectiveness of BERT models in classifying lifestyle status. All models were evaluated and compared on the developed Gold Standard Corpus (GSC) for the two case studies. The rule-based model used for weak supervision was also tested on the GSC for comparison.

Results: The UMLS BERT model achieved the best performance for classifying the status of physical activity, with precision, recall, and F-1 scores of 0.93, 0.93, and 0.92, respectively. For classifying excessive diet, the Bio-clinical BERT model showed the best performance, with precision, recall, and F-1 scores of 0.93, 0.93, and 0.93, respectively.

Conclusion: The proposed approach leveraging weak supervision could significantly increase the sample size, which is required for training deep learning models. Through comparison with traditional machine learning models, the study also demonstrates the high performance of BERT models for classifying lifestyle status for Alzheimer’s disease in clinical notes.
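The weak supervision step described in the Methods can be illustrated with a minimal Python sketch. The keyword rules, label names (`active`, `inactive`), and example sentences below are hypothetical: they are not the authors' rule-based algorithm or data. The sketch only shows how a rule-based labeler might generate training labels automatically, and how precision, recall, and F-1 could be computed against a small hand-labeled set.

```python
import re

# Hypothetical rules for illustration only; NOT the authors' actual rule set.
NEGATION = re.compile(r"\b(no|not|denies|never|without)\b", re.IGNORECASE)
ACTIVITY = re.compile(r"\b(exercis\w*|walk\w*|jog\w*|swim\w*|gym)\b", re.IGNORECASE)


def weak_label(sentence):
    """Return 'active', 'inactive', or None when no rule fires."""
    if not ACTIVITY.search(sentence):
        return None  # the rule abstains, so the sentence stays unlabeled
    return "inactive" if NEGATION.search(sentence) else "active"


def precision_recall_f1(gold, pred, positive):
    """Compute precision, recall, and F-1 for a single class label."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up sentences standing in for clinical notes.
notes = [
    "Patient walks 30 minutes daily.",
    "Patient denies any regular exercise.",
    "Follow up in 3 months.",
    "She goes to the gym twice a week.",
]
weak_labels = [weak_label(n) for n in notes]

# Keep only sentences where a rule fired; these would form the weakly
# labeled corpus used to train a downstream classifier.
training_set = [(n, y) for n, y in zip(notes, weak_labels) if y is not None]
```

In a setup like the one the abstract describes, `training_set` would be passed to a classifier (a BERT model or a traditional baseline), and `precision_recall_f1` would be applied to that classifier's predictions on a manually annotated corpus rather than on the weak labels themselves.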

Journal: BMC Medical Informatics and Decision Making
Publication Date: Jul 1, 2022
Citations: 15
License type: open-access

Similar Papers

Engineering Document Summarization Using Sentence Representations Generated by Bidirectional Language Model
Yan Jin ... Yunjian Qiu
17 Aug 2021

Identification of asthma control factor in clinical notes using a hybrid deep learning model
Bhavani Singh Agnikula Kshatriya ... Chung-Il Wi
BMC Medical Informatics and Decision Making | VOL. 21
01 Nov 2021

Oversampling effect in pretraining for bidirectional encoder representations from transformers (BERT) to localize medical BERT and enhance biomedical BERT
Shoya Wada ... Yasushi Matsumura
Artificial Intelligence In Medicine | VOL. 153
05 May 2024

Bert model fine-tuning for text classification in knee OA radiology reports
L Chen ... V Pedoia
Osteoarthritis and Cartilage | VOL. 28
01 Apr 2020