Abstract

Background: We developed transformer-based deep learning models based on natural language processing for early risk assessment of Alzheimer’s disease from the picture description test.

Methods: The lack of large datasets poses the most important limitation for using complex models that do not require feature engineering. Transformer-based pre-trained deep language models have recently made a large leap in NLP research and application. These models are pre-trained on available large datasets to understand natural language texts appropriately, and are shown to subsequently perform well on classification tasks with small training sets. The overall classification model is a simple classifier on top of the pre-trained deep language model.

Results: The models are evaluated on picture description test transcripts of the Pitt corpus, which contains data of 170 AD patients with 257 interviews and 99 healthy controls with 243 interviews. The large bidirectional encoder representations from transformers (BERTLarge) embedding with logistic regression classifier achieves classification accuracy of 88.08%, which improves the state-of-the-art by 2.48%.

Conclusions: Using pre-trained language models can improve AD prediction. This not only solves the problem of lack of sufficiently large datasets, but also reduces the need for expert-defined features.
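
The abstract's "simple classifier on top of the pre-trained deep language model" can be pictured as logistic regression trained on fixed embedding vectors. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes the embeddings have already been computed, and the 2-dimensional vectors below are made-up stand-ins for illustration (real BERT-Large embeddings have 1024 dimensions). The function names `train_logistic_regression` and `predict` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss.

    X: list of fixed feature vectors (stand-ins for transcript embeddings).
    y: list of 0/1 labels (here 1 = AD, 0 = healthy control, by assumption).
    """
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log loss w.r.t. the score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Toy, made-up vectors standing in for embeddings of four transcripts.
X = [[2.0, 0.1], [1.8, 0.3], [0.2, 1.9], [0.1, 2.2]]
y = [1, 1, 0, 0]

w, b = train_logistic_regression(X, y)
preds = [predict(w, b, x) for x in X]
print(preds)
```

Because the language model is only used to produce the input vectors, the trainable part stays small, which is what makes this setup workable on a dataset of a few hundred interviews.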

Highlights

  • We developed transformer-based deep learning models based on natural language processing for early risk assessment of Alzheimer’s disease from the picture description test

  • The models are evaluated on the transcripts of the Cookie-Theft picture description test of the Pitt corpus from the DementiaBank dataset, which contains 170 possible or probable Alzheimer’s disease (AD) patients with 257 interviews and 99 healthy control (HC) participants with 243 interviews

  • The first problem with using the entire corpus is that the corpus is highly unbalanced, and as a result, a naïve classifier that always outputs AD labels can achieve a classification accuracy of 78% on such a dataset
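
The majority-class baseline mentioned in the last highlight is easy to compute from class counts. The sketch below is illustrative only; the helper name `majority_baseline_accuracy` is hypothetical. It is applied to the 257 AD and 243 HC interview counts quoted above, since the per-class counts behind the 78% figure for the entire corpus are not given in this summary.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Return the most frequent label and the accuracy of always predicting it."""
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    return majority_label, majority_count / len(labels)


# Interview counts quoted in this summary for the picture description subset.
labels = ["AD"] * 257 + ["HC"] * 243

label, acc = majority_baseline_accuracy(labels)
print(f"always predict {label}: accuracy = {acc:.3f}")
```

On these counts the baseline sits near chance, so a reported accuracy is far easier to interpret here than on a set where one class dominates.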


Summary

Introduction

We developed transformer-based deep learning models based on natural language processing for early risk assessment of Alzheimer’s disease from the picture description test. The healthcare industry has quickly realized the importance of data and has started collecting them from a variety of sources, such as electronic health records (EHR) and sensors. Analyzing these data and making decisions based on them is time consuming and complicated. These textual data contain a large amount of information and hidden relationships that are difficult for humans to extract. In this regard, the use of machine learning and natural language processing (NLP) to analyze these data and draw inferences from the analysis has received increased attention. Given the impact of AD on patients’ speech abilities, this study aims to develop a technique for AD risk assessment from transcripts of targeted speech elicited from the participants.

