Abstract

Background
Analyses of sex chromosomes are under-reported in Alzheimer's disease (AD) studies. As knowledge of disease associations and tissue-specific regulatory effects of autosomal loci grows, so does interest in building prediction models of gene expression in the brain that capture the cumulative effects of non-coding variants on the sex chromosomes. Transcriptome-wide Association Studies (TWAS) identify disease-associated genes by integrating genotype, imputed gene expression, and phenotype information, using prediction models built for specific tissues. A key challenge in applying this approach to sex chromosomes is modeling the copy number difference between males and females: males are hemizygous for X-chromosome variants, whereas females are diploid and can be heterozygous or homozygous.

Method
We explore modeling approaches on the X chromosome using whole-genome sequencing (WGS) and RNA-seq data from 13 brain sub-regions in the GTEx project, currently restricting analyses to males. We used elastic-net regression with a balanced LASSO-ridge penalty (mixing parameter α = 0.5), as recommended by Wheeler et al. (2016). There are 2392 genes on chromosome X, of which only approximately 800 are expressed in each GTEx brain tissue. For each gene in each tissue, SNPs (MAF > 0.05) within the gene's 1 Mb flanking window were used to train the model. We tuned the penalization parameter λ, assessed model performance, and selected the best model by mean squared error.

Result
We built 761 prediction models across 13 brain sub-regions, covering the expression of 503 genes. For example, we fit a prediction model for the gene COL4A6 (ENSG00000197565) in the cortex using 89 males. The final model (λ = 0.015) retained 83 cis-SNPs. The model R² is 0.20, suggesting that the model explains 20% of the variance in expression levels in males.

Conclusion
These models predict cis-regulated gene expression on the X chromosome in males. The estimated gene expression weights can then be used to impute male expression levels and, with TWAS, to identify putative causal genes for AD.
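
The per-gene fitting step described in the Method can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are simulated, the coordinate-descent solver is a plain textbook implementation written for this example, and the fold count, λ grid, effect sizes, and the 0/1 coding of hemizygous male genotypes are all assumptions. Only the mixing parameter α = 0.5, the sample size of 89 males, and model selection by mean squared error come from the abstract. In practice an established elastic-net library would normally replace the hand-written solver.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used by the LASSO part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def fit_elastic_net(X, y, lam, alpha=0.5, n_iter=50, tol=1e-6):
    """Coordinate descent for
    (1/2n)*||y - b0 - X b||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*||b||^2)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered SNPs
    resid = [yi - y_mean for yi in y]                           # centered expression
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        max_step = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic SNP in this training set
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            step = new - beta[j]
            if step != 0.0:
                for i in range(n):
                    resid[i] -= step * Xc[i][j]
                beta[j] = new
                max_step = max(max_step, abs(step))
        if max_step < tol:
            break
    b0 = y_mean - sum(b * m for b, m in zip(beta, x_mean))  # unpenalized intercept
    return b0, beta

def predict(model, X):
    b0, beta = model
    return [b0 + sum(b * x for b, x in zip(beta, row)) for row in X]

def mse(y, y_hat):
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

def cv_select_lambda(X, y, lambdas, k=5, alpha=0.5):
    """Return the lambda with the lowest mean k-fold cross-validated MSE."""
    n = len(y)
    folds = [list(range(f, n, k)) for f in range(k)]
    cv_mse = {}
    for lam in lambdas:
        errs = []
        for test_idx in folds:
            held = set(test_idx)
            train = [i for i in range(n) if i not in held]
            model = fit_elastic_net([X[i] for i in train], [y[i] for i in train], lam, alpha)
            errs.append(mse([y[i] for i in test_idx],
                            predict(model, [X[i] for i in test_idx])))
        cv_mse[lam] = sum(errs) / k
    return min(cv_mse, key=cv_mse.get), cv_mse

# Simulated example: 89 males, 20 cis-SNPs, hemizygous genotypes coded 0/1 (assumption).
rng = random.Random(0)
n_males, n_snps = 89, 20
X = [[1.0 if rng.random() < 0.3 else 0.0 for _ in range(n_snps)] for _ in range(n_males)]
y = [0.8 * r[0] - 0.6 * r[1] + 0.5 * r[2] + rng.gauss(0.0, 0.5) for r in X]

lambdas = [0.5, 0.1, 0.05, 0.015, 0.005]  # hypothetical grid
best_lam, cv_mse = cv_select_lambda(X, y, lambdas)
b0, beta = fit_elastic_net(X, y, best_lam)
n_kept = sum(1 for b in beta if b != 0.0)  # SNPs with non-zero weight
y_var = mse(y, [sum(y) / len(y)] * len(y))
r2_in_sample = 1.0 - mse(y, predict((b0, beta), X)) / y_var  # optimistic, in-sample only
print(best_lam, n_kept, round(r2_in_sample, 3))
```

The non-zero entries of `beta` play the role of the per-SNP expression weights mentioned in the Conclusion. The in-sample R² printed here is optimistic by construction, so a cross-validated estimate would be the more informative figure when comparing models.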