Abstract

Background
Large‐scale multi‐study analyses are required to ensure reproducibility, reliability, and generalizability in mapping neurodegeneration and risk for Alzheimer's disease and related dementias (ADRD). However, heterogeneity in data collection paradigms can complicate and confound data pooling, so data harmonization is essential. Longitudinal studies add to the complexity of harmonization because they use a variety of follow‐up time points and time encoding schemes. Here, we pool data from 8 publicly available longitudinal neuroimaging studies on neurodegeneration and dementia to highlight differences in: 1) diagnostic categorization of controls, people with mild cognitive impairment, and people with dementia; 2) the extent of follow‐up visits across neuroimaging and clinical assessments; and 3) encodings of various meta‐data elements, including scanner manufacturer, sex, and handedness. To allow a systematic approach to multi‐study dementia research, we propose an initial ontological framework for longitudinal data archival and retrieval that captures similar data elements within given themes while ensuring that unique differences across studies are retained.

Method
AIBL, ADNI‐1, ADNI‐2/GO, ADNI‐3, OASIS‐2, OASIS‐3, PREVENT‐AD, and MIRIAD were accessed. Study designs, inclusion and exclusion criteria, and downloaded data elements were used to describe diagnostic criteria, imaging data, demographic features, and longitudinal data collection schemes. We created common terms to map data elements across studies, and a consistent naming scheme for longitudinal data.

Result
Figure 1 shows the breakdown of diagnostic labels per cohort and highlights how different cognitive labels may be assigned to people with the same performance scores, for example, on the Mini‐Mental State Examination (MMSE). Figure 2 highlights the differences across cohorts in longitudinal data collection schemes and in the labels of commonly collected elements. Using common terms provided an easy way to query data across all studies simultaneously while retaining a map back to the original terms. Figure 3 showcases our proposed naming scheme for capturing longitudinal data elements using a relational ontological framework. This framework successfully harmonized data across study demographics, timepoints, and imaging properties.

Conclusion
Multi‐study data analyses are becoming common practice thanks to large‐scale efforts such as ENIGMA and GAAIN. Harmonization of meta‐data across studies will allow for efficient pooling of data for tracking disease progression and related risk over time.
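
As an illustration of the common‐term mapping described under Method and Result, the following is a minimal sketch, not the framework itself. The study names (`study_a`, `study_b`), the per‐study sex encodings, and the names `TERM_MAP`, `HarmonizedValue`, `harmonize`, and `query` are all hypothetical and chosen only for this example; the abstract does not specify the actual encodings used by each cohort. The sketch shows one way to query by a common term while retaining the original study‐specific value.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-study encodings of one meta-data element ("sex").
# The real cohorts' codes are not given in the abstract; these are placeholders.
TERM_MAP = {
    "study_a": {"sex": {"M": "male", "F": "female"}},
    "study_b": {"sex": {"1": "male", "2": "female"}},
}


@dataclass(frozen=True)
class HarmonizedValue:
    study: str
    element: str
    original: str           # value exactly as encoded by the source study
    common: Optional[str]   # common term, or None if no mapping is defined


def harmonize(study, element, value):
    """Map a study-specific value to a common term, keeping the original."""
    original = str(value)
    common = TERM_MAP.get(study, {}).get(element, {}).get(original)
    return HarmonizedValue(study, element, original, common)


def query(records, element, common):
    """Return all records, from any study, that share a common term."""
    return [r for r in records if r.element == element and r.common == common]


records = [
    harmonize("study_a", "sex", "M"),
    harmonize("study_b", "sex", 1),
    harmonize("study_b", "sex", "2"),
]
males = query(records, "sex", "male")
```

Because each record keeps its `original` value next to the `common` term, a pooled query can always be traced back to the source study's own encoding, and unmapped values surface as `None` rather than being silently coerced.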