Analysis and Classification of Word Co-Occurrence Networks From Alzheimer’s Patients and Controls

Tristan Millington, Saturnino Luz

doi:10.3389/fcomp.2021.649508

Open Access

Journal: Frontiers in computer science	Publication Date: Apr 29, 2021
Citations: 7	License type: CC BY 4.0

Affiliation: University of Edinburgh

Abstract

In this paper we construct word co-occurrence networks from transcript data of controls and patients with potential Alzheimer’s disease using the ADReSS challenge dataset of spontaneous speech. We examine measures of the structure of these networks for significant differences, finding that networks from Alzheimer’s patients have a lower heterogeneity and centralization, but a higher edge density. We then use these measures, a network embedding method and some measures from the word frequency distribution to classify the transcripts into control or Alzheimer’s, and to estimate the cognitive test score of a participant based on the transcript. We find it is possible to distinguish between the AD and control networks on structure alone, achieving 66.7% accuracy on the test set, and to predict cognitive scores with a root mean squared error of 5.675. Using the network measures is more successful than using the network embedding method. However, if the networks are shuffled we find relatively few of the measures are different, indicating that word frequency drives many of the network properties. This observation is borne out by the classification experiments, where word frequency measures perform similarly to the network measures.
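To make the construction concrete, the following is a minimal sketch of how a word co-occurrence network might be built from a tokenized transcript and how three of the structural measures named above could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function names, the simple whitespace tokenization, and the specific formulas (density as 2E/(N(N-1)), heterogeneity as the coefficient of variation of the degree, and Freeman degree centralization) are choices made here and may differ from the definitions used in the paper.

```python
from statistics import mean, pstdev


def cooccurrence_edges(tokens, window=2):
    """Return undirected edges linking distinct words that occur
    within `window` consecutive tokens of each other.
    (Hypothetical helper; window=2 links adjacent words only.)"""
    edges = set()
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            a, b = tokens[i], tokens[j]
            if a != b:  # ignore self-loops from repeated words
                edges.add(frozenset((a, b)))
    return edges


def network_measures(tokens, window=2):
    """Compute edge density, degree heterogeneity and degree
    centralization. Assumes at least 3 distinct words and 1 edge.
    The formulas are common textbook definitions, assumed here."""
    edges = cooccurrence_edges(tokens, window)
    degree = {word: 0 for word in set(tokens)}
    for edge in edges:
        for word in edge:
            degree[word] += 1
    n = len(degree)
    degs = list(degree.values())
    return {
        # fraction of possible word pairs that are connected
        "density": 2 * len(edges) / (n * (n - 1)),
        # coefficient of variation of the degree distribution
        "heterogeneity": pstdev(degs) / mean(degs),
        # Freeman degree centralization (1.0 for a star graph)
        "centralization": sum(max(degs) - d for d in degs) / ((n - 1) * (n - 2)),
    }


if __name__ == "__main__":
    transcript = "the boy takes the cookie from the jar".split()
    print(network_measures(transcript, window=2))
```

A larger `window` links words that are further apart in the transcript, which generally raises the edge density of the resulting network.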

Highlights

As populations continue to age, the development of automated methods to help reduce the amount of in-person care required is becoming an important research topic
We show the means of these for each group for a variety of co-occurrence windows (o) in Table 1, where bold font indicates the mean difference is significant according to a Mann-Whitney test at p < 0.05 for that co-occurrence window
In this paper we have constructed word co-occurrence networks using transcript data from both controls and Alzheimer’s patients on a picture description task. With these networks we have analyzed some measures of their structure, and used some embedding methods to enable classification of the networks and to predict the mini-mental state examination (MMSE) score from the transcript
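
The group comparison mentioned in the highlights relies on a Mann-Whitney test. As a rough illustration of how such a test works, the sketch below computes the U statistic and a two-sided p-value with the normal approximation. It is a simplified version written for this summary, with no tie or continuity correction, and the function name is hypothetical; the authors may well have used a library routine with different settings.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.
    Simplified sketch: tied values get average ranks, but the
    variance is not tie-corrected and no continuity correction
    is applied. Returns (U statistic for x, p-value)."""
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p
```

Given one list of a network measure per group, for example a hypothetical `density_ad` and `density_control`, calling `mann_whitney_u(density_ad, density_control)` and checking whether the returned p-value is below 0.05 mirrors the significance criterion described above.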

Summary

Introduction

As populations continue to age, the development of automated methods to help reduce the amount of in-person care required is becoming an important research topic. Dementia is a particular issue, where the cognitive function of a person declines as they age, with symptoms including memory loss, motor problems, deterioration of visuospatial function, language impairment and emotional distress. These issues tend to reduce the ability of a person to care for themselves, placing an added burden on their carers and/or relatives. Dementia shows various linguistic effects, with patients tending to produce sentences with less information, less syntactic complexity (Pakhomov et al., 2011), fewer unique words and more meaningless sentences (Fraser et al., 2016). These effects can be used for non-invasive diagnosis and analysis of dementia, and so in this paper we look at using text classification methods to this end.

Methods

Results

Conclusion