Abstract

Background
Voice disorders are common in People Living with Dementia (PLwD), and a decrease in voice quality can be an indicator of cognitive deterioration. Previous computational work has successfully employed voice to detect or predict dementia. Voice quality can be assessed by a speech therapist, or computationally when subjects periodically repeat the same sentences. To our knowledge, however, there is no research on monitoring voice to track dementia progression from unconstrained data collected in the wild.

Method
In this work, we employ a dataset containing audio recordings of 14 households (a carer, a PLwD and visitors) interacting with Alexa. We identify several voice features, such as speech rate and openSMILE features, that differ between healthy users and PLwD. We create a voice footprint for each PLwD using a Gaussian distribution and identify audio recordings that significantly differ from this baseline over time by combining Expectation-Maximisation and Mahalanobis-distance outlier detection.

Result
The selected voice features differ significantly between our two user groups (p-value < 0.0001). Training a classifier on these features allows us to identify healthy subjects and PLwD with over 90% accuracy. We are also able to quantify how much a voice differs from its baseline over time.

Conclusion
While we can computationally identify voice variations that are out of the ordinary, in future work we will compare our results to a clinical assessment of voice quality. We will also integrate a semantic analysis to identify grammatical errors or the use of wrong words. This will allow us to design an objective index for monitoring speech quality.
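The baseline-and-deviation idea described in the Method can be sketched as follows. This is a minimal illustration, not the paper's implementation: for a single Gaussian component the EM fit reduces to the closed-form sample mean and covariance, and the feature values, dimensions, and distance threshold below are all hypothetical.

```python
import numpy as np

def fit_voice_footprint(features):
    """Fit a Gaussian 'footprint' (mean + inverse covariance) to a
    subject's baseline voice-feature vectors (rows = recordings).
    For one Gaussian component, the EM estimate is simply the
    sample mean and covariance."""
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    return mu, np.linalg.inv(cov)

def mahalanobis_outliers(features, mu, cov_inv, threshold=3.0):
    """Return each recording's Mahalanobis distance from the baseline
    and a flag for distances above `threshold` (an illustrative
    cut-off, roughly '3 sigmas'; the paper's criterion may differ)."""
    diff = features - mu
    dist = np.sqrt(np.einsum('ij,jk,ik->i', diff, cov_inv, diff))
    return dist, dist > threshold

# Toy example with 2-D feature vectors (e.g. speech rate, jitter).
rng = np.random.default_rng(0)
baseline = rng.normal([5.0, 1.0], [0.5, 0.1], size=(200, 2))
mu, cov_inv = fit_voice_footprint(baseline)

new = np.array([[5.1, 1.05],   # close to the baseline footprint
                [8.0, 2.00]])  # far from the baseline footprint
dist, flags = mahalanobis_outliers(new, mu, cov_inv)
```

In this sketch, only the second recording is flagged as a significant deviation; tracking the distances of a PLwD's recordings over time gives the kind of drift-from-baseline signal the abstract describes.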