1. Machine learning for comprehensive forecasting of Alzheimer’s Disease progression
- Author
- Fisher, Charles K, Smith, Aaron M, and Walsh, Jonathan R
- Subjects
Information and Computing Sciences; Biological Sciences; Machine Learning; Aging; Brain Disorders; Acquired Cognitive Impairment; Neurodegenerative; Alzheimer's Disease including Alzheimer's Disease Related Dementias (AD/ADRD); Alzheimer's Disease; Dementia; Good Health and Well Being; Aged; Aged, 80 and over; Alzheimer Disease; Cognition; Cognitive Dysfunction; Disease Progression; Female; Forecasting; Humans; Male; Middle Aged; Models, Statistical; Models, Theoretical; Neuropsychological Tests; Coalition Against Major Diseases; Abbott; Alliance for Aging Research; Alzheimer’s Association; Alzheimer’s Foundation of America; AstraZeneca Pharmaceuticals LP; Bristol-Myers Squibb Company; Critical Path Institute; CHDI Foundation, Inc.; Eli Lilly and Company; F. Hoffmann-La Roche Ltd; Forest Research Institute; Genentech, Inc.; GlaxoSmithKline; Johnson & Johnson; National Health Council; Novartis Pharmaceuticals Corporation; Parkinson’s Action Network; Parkinson’s Disease Foundation; Pfizer, Inc.; sanofi-aventis. Collaborating Organizations: Clinical Data Interchange Standards Consortium (CDISC); Ephibian; Metrum Institute.
- Abstract
Most approaches to machine learning from electronic health data can only predict a single endpoint. The ability to simultaneously simulate dozens of patient characteristics is a crucial step towards personalized medicine for Alzheimer's Disease. Here, we use an unsupervised machine learning model called a Conditional Restricted Boltzmann Machine (CRBM) to simulate detailed patient trajectories. We use data comprising 18-month trajectories of 44 clinical variables from 1909 patients with Mild Cognitive Impairment or Alzheimer's Disease to train a model for personalized forecasting of disease progression. We simulate synthetic patient data including the evolution of each sub-component of cognitive exams, laboratory tests, and their associations with baseline clinical characteristics. Synthetic patient data generated by the CRBM accurately reflect the means, standard deviations, and correlations of each variable over time to the extent that synthetic data cannot be distinguished from actual data by a logistic regression. Moreover, our unsupervised model predicts changes in total ADAS-Cog scores with the same accuracy as specifically trained supervised models, additionally capturing the correlation structure in the components of ADAS-Cog, and identifies sub-components associated with word recall as predictive of progression.
- Published
- 2019