Abstract

Most approaches to machine learning from electronic health data can only predict a single endpoint. The ability to simultaneously simulate dozens of patient characteristics is a crucial step towards personalized medicine for Alzheimer’s Disease. Here, we use an unsupervised machine learning model called a Conditional Restricted Boltzmann Machine (CRBM) to simulate detailed patient trajectories. We use data comprising 18-month trajectories of 44 clinical variables from 1909 patients with Mild Cognitive Impairment or Alzheimer’s Disease to train a model for personalized forecasting of disease progression. We simulate synthetic patient data including the evolution of each sub-component of cognitive exams, laboratory tests, and their associations with baseline clinical characteristics. Synthetic patient data generated by the CRBM accurately reflect the means, standard deviations, and correlations of each variable over time to the extent that synthetic data cannot be distinguished from actual data by a logistic regression. Moreover, our unsupervised model predicts changes in total ADAS-Cog scores with the same accuracy as specifically trained supervised models, additionally capturing the correlation structure in the components of ADAS-Cog, and identifies sub-components associated with word recall as predictive of progression.
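The claim that synthetic data "cannot be distinguished from actual data by a logistic regression" can be illustrated with a small discriminator check. The sketch below is not the authors' implementation: the function names (`fit_logistic`, `auc`, `discriminator_auc`), the 50/50 train/test split, the gradient-descent settings, and the use of AUC as the score are all assumptions made for illustration. The idea is to label real rows 1 and synthetic rows 0, fit a logistic regression on half of the pooled rows, and score the held-out half. An AUC near 0.5 would suggest the classifier cannot tell the two sources apart.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent (illustrative settings)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def discriminator_auc(real, synthetic, seed=0):
    """Train on half of the pooled rows, return AUC on the held-out half."""
    rng = random.Random(seed)
    data = [(row, 1) for row in real] + [(row, 0) for row in synthetic]
    rng.shuffle(data)
    split = len(data) // 2
    train, test = data[:split], data[split:]
    w, b = fit_logistic([r for r, _ in train], [l for _, l in train])
    scores = [sum(wj * xj for wj, xj in zip(w, r)) + b for r, _ in test]
    return auc(scores, [l for _, l in test])
```

In this sketch each row is a list of numeric features for one patient at one time point. Real clinical data would also need handling of missing values and mixed variable types before such a check could be run, which this example does not attempt.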

Highlights

  • Most approaches to machine learning from electronic health data can only predict a single endpoint

  • We extracted 18-month longitudinal trajectories of 1909 patients with Mild Cognitive Impairment (MCI) or Alzheimer’s Disease (AD) covering 44 variables including the individual components of the Alzheimer’s Disease Assessment Scale (ADAS)-Cog and Mini Mental State Exam (MMSE) scores, laboratory tests, and background information

  • We focus on disease progression as assessed by the overall ADAS-Cog[11] score rather than the individual components


Summary

Introduction

Most approaches to machine learning from electronic health data can only predict a single endpoint. We use data comprising 18-month trajectories of 44 clinical variables from 1909 patients with Mild Cognitive Impairment or Alzheimer’s Disease to train a model for personalized forecasting of disease progression. Computational models of disease progression developed using machine learning approaches provide an attractive tool to combat patient heterogeneity. One day these computational models may be used to guide clinical decisions; however, current applications are limited both by the availability of data and by the ability of algorithms to extract insights from those data. Most clinical datasets contain multiple types of data (i.e., they are “multimodal”), have a relatively small number of samples, and have many missing observations. Dealing with these issues typically requires extensive preprocessing[3] or discarding variables that are too difficult to model. The heterogeneity of AD and related dementias makes these diseases difficult to […]

