Abstract

Background

Studies of cerebrospinal fluid (CSF) biomarkers of presymptomatic disease processes in Alzheimer’s disease (AD) are limited by sample sizes that lack the statistical power to detect subtle risk‐related effects. Pooling data across cohorts may be an effective way to identify early trends relating CSF biomarkers to subsequent neurodegeneration and cognitive decline. However, differences in instruments, preanalytics and workflows across sites pose unique challenges for the analysis of a pooled dataset. We report initial findings from harmonized pooling of cross‐sectional CSF measurements across three preclinical AD studies at different institutions, for a combined cohort size of N=733.

Method

CSF measurements were contributed by three studies: the Adult Children Study (ACS), the Wisconsin Registry for Alzheimer’s Prevention (WRAP) and Biomarkers of Cognitive Decline Among Normal Individuals (BIOCARD). For harmonization, ACS served as the reference site, and the CSF measurements from WRAP and BIOCARD were transformed to it using our domain adaptation algorithm (an affine transformation that aligns covariate‐matched distributions under a distributional discrepancy measure). Parameters were optimized using stochastic gradient descent in PyTorch.

Result

a) Figure 1 shows each CSF measure as a function of age for each study (using the first CSF timepoint). The trend lines with increasing age are preserved in the harmonized pooled data (Figures 1‐2). b) Figure 3 shows each CSF measure before and after harmonization. c) Figure 4 shows scatter plots of CSF measures before and after harmonization, with a categorical coding for APOE e4 status. The shape of the distribution for each measure is preserved.

Conclusion

These results show the feasibility of aligning CSF values from different cohorts and assay platforms (Lumipulse G1200 assay in ACS and BIOCARD, Roche Elecsys in WRAP).
Mathematically rigorous schemes for the analysis of pooled datasets, based on modern developments in machine learning, are facilitating AD research. We have shown that deploying such ideas to harmonize and pool CSF measures across preclinical AD studies appears viable. Our ongoing investigations focus on understanding and quantifying the degree of improvement such schemes may offer for downstream statistical analyses, such as modeling longitudinal trajectories of these measures as a function of demographic variables and risk factors.
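
The harmonization step described in the Method can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. The abstract does not name its discrepancy measure, so the sketch uses a squared maximum mean discrepancy (MMD) with a Gaussian kernel. The data are synthetic stand-ins for one CSF measure at a reference site and at one other site. Covariate matching is omitted. The names `rbf_mmd2` and `fit_affine`, the bandwidth, the learning rate and the number of steps are all hypothetical choices.

```python
import torch

torch.manual_seed(0)


def rbf_mmd2(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between two 1-D samples (Gaussian kernel).

    MMD is an assumed choice of discrepancy; the abstract does not name one.
    """
    def kernel(a, b):
        sq_dist = (a[:, None] - b[None, :]) ** 2
        return torch.exp(-sq_dist / (2.0 * bandwidth ** 2))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def fit_affine(source, reference, steps=1000, lr=0.2):
    """Learn scale and shift so that scale * source + shift resembles reference."""
    scale = torch.tensor(1.0, requires_grad=True)
    shift = torch.tensor(0.0, requires_grad=True)
    optimizer = torch.optim.SGD([scale, shift], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        # The full sample is used at every step for simplicity;
        # mini-batches would make the updates stochastic.
        loss = rbf_mmd2(scale * source + shift, reference)
        loss.backward()
        optimizer.step()
    return scale.detach(), shift.detach()


# Synthetic stand-ins (assumption): the same underlying quantity measured
# on a different scale and with a different offset at the second site.
reference = torch.randn(300)
source = 1.5 * torch.randn(300) + 1.0

scale, shift = fit_affine(source, reference)
harmonized = scale * source + shift

print("learned scale and shift:", scale.item(), shift.item())
print("mean gap before:", (source.mean() - reference.mean()).abs().item())
print("mean gap after: ", (harmonized.mean() - reference.mean()).abs().item())
```

Because the learned map is affine with a positive scale, it preserves the ordering of values within the transformed site and the shape of their distribution up to location and scale. In a covariate‐matched setting, the two samples passed to `fit_affine` would first be restricted to participants with comparable covariates such as age.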

