Abstract
Gene expression fundamentally shapes the structural and functional architecture of the human brain. Open-access transcriptomic datasets like the Allen Human Brain Atlas provide an unprecedented ability to examine these mechanisms in vivo; however, a lack of standardization across research groups has given rise to myriad processing pipelines for using these data. Here, we develop the abagen toolbox, an open-access software package for working with transcriptomic data, and use it to examine how methodological variability influences the outcomes of research using the Allen Human Brain Atlas. Applying three prototypical analyses to the outputs of 750,000 unique processing pipelines, we find that choice of pipeline has a large impact on research findings, with parameters commonly varied in the literature influencing correlations between derived gene expression and other imaging phenotypes by as much as ρ ≥ 1.0. Our results further reveal an ordering of parameter importance, with processing steps that influence gene normalization yielding the greatest impact on downstream statistical inferences and conclusions. The presented work and the development of the abagen toolbox lay the foundation for more standardized and systematic research in imaging transcriptomics, and will help to advance future understanding of the influence of gene expression in the human brain.
Highlights
Technologies like magnetic resonance imaging (MRI) provide unique insights into macroscopic brain structure and function in vivo
Although we find moderate consistency in the statistical estimates generated by the pipelines, there are important differences
We conducted a comprehensive analysis examining whether and how different processing options modify statistical estimates derived from analyses using the Allen Human Brain Atlas (AHBA)
Summary
Technologies like magnetic resonance imaging (MRI) provide unique insights into macroscopic brain structure and function in vivo. Gene expression is useful as it is a fundamental molecular phenotype that can be plausibly linked to the function of biological pathways (Seidlitz et al 2018, Whitaker et al 2016), protein synthesis (Zheng et al 2019), receptor distributions (Beliveau et al 2017, Deco et al 2020, Nørgaard et al 2021, Preller et al 2018, Shine et al 2019), and cell types (Anderson et al 2020b, 2018, Gao et al 2020, Hansen et al 2021, Seidlitz et al 2020). There are numerous technical and analytic considerations. One foundational issue is that acquiring high-quality transcriptomic data from the human brain is both costly and highly invasive, requiring budgets far greater than most typical neuroimaging studies and restrictive access to tissue from post-mortem donors or cranial surgical patients. Researchers must therefore often rely on freely available repositories of gene expression data.