Using Spontaneous Speech Recognition as a Biomarker to Distinguish Dementia Patients

Cynthia Nyongesa, Judy Pa

doi:10.1002/alz.082145

Abstract

Background: Digital biomarkers such as individual speech recordings are cost‐effective tools for assessing cognitive status from linguistic and acoustic content. The purpose of this study is to investigate whether Natural Language Processing (NLP) methods and audio feature extraction can distinguish the spontaneous speech of dementia patients from that of a healthy population.

Method: 240 audio samples and transcripts of clinician‐participant dialogues involving the Cookie‐Theft visual description were acquired from the DementiaBank Pitt corpus (dementia group (DG) n = 169; healthy controls (CG) n = 71). Lexical variety, syntactic complexity, salience of information, disfluency patterns, and sentiment were evaluated. Sound synthesis and an acoustic R analysis package were used to filter and examine acoustic properties of the recorded audio, such as silence and speech segmentation, pitch tracking, and frequency modulation of vibrato or jitter. DG and CG were compared using non‐parametric Kruskal‐Wallis tests, Dunn’s multiple comparison, and the Mann‐Whitney U‐test. We hypothesized that DG would differ from CG in levels of lexical diversity, grammatical complexity, and overall salience.

Result: Overall, text analysis showed no variations in disfluencies between groups, but there were significant differences in lexical diversity, syntactic complexity, and reference rate to reality (p<0.0001). Additionally, there were no appreciable differences in the frequency of salient information between the two groups; however, the frequency of non‐salient information and disfluencies was considerably higher in the DG (p<0.05). Analysis of acoustic features revealed that DG had longer average speech duration and speech segment durations compared to CG (p<0.001). On the other hand, the rate of change of acoustic characteristics including pitch, mean entropy, and harmonics‐to‐noise ratios did not significantly differ between DG and CG.

Conclusion: In comparison to healthy controls, dementia patients exhibited increased lexical diversity yet lower linguistic complexity, a shift towards characterizing non‐salient objects in given stimuli, and longer segments of speech. Further examination of the rhythmic and acoustic characteristics of speech may provide a comprehensive framework towards defining speech as a potential digital biomarker for dementia risk, and provide applications to regions experiencing increasing socioeconomic disparities.
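
As an illustration of the kind of group comparison described in the Method, the following is a minimal Python sketch and not the authors' pipeline. It assumes lexical diversity is measured as a type‐token ratio (the abstract does not name the measure) and computes only the Mann‐Whitney U statistic by direct pair counting. The function names `type_token_ratio` and `mann_whitney_u` are hypothetical, and the toy sentences are invented for illustration; they are not DementiaBank data.

```python
from itertools import product


def type_token_ratio(transcript: str) -> float:
    """Assumed lexical-diversity measure: unique words / total words."""
    # Lowercase and strip surrounding punctuation from whitespace-split tokens.
    tokens = [w.strip(".,;:!?\"'()").lower() for w in transcript.split()]
    tokens = [t for t in tokens if t]
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mann_whitney_u(xs, ys):
    """U statistic for xs: number of (x, y) pairs with x > y; ties count 0.5."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x, y in product(xs, ys)
    )


# Invented toy transcripts, purely illustrative (not study data).
group_a = ["the boy takes a cookie", "the the the water water"]
group_b = ["the mother dries the dishes", "a a a a"]

ttr_a = [type_token_ratio(t) for t in group_a]  # [1.0, 0.4]
ttr_b = [type_token_ratio(t) for t in group_b]  # [0.8, 0.25]

print(mann_whitney_u(ttr_a, ttr_b))  # prints 3.0
```

The sketch stops at the U statistic. Turning it into a p‐value requires the null distribution of U, which in practice would come from a statistics package (for example `scipy.stats.mannwhitneyu`) rather than hand‐written code.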

Full Text