Abstract

Large-scale sequencing efforts have documented extensive genetic variation within the human genome. However, our understanding of the origins, global distribution, and functional consequences of this variation is far from complete. While regulatory variation influencing gene expression has been studied within a handful of populations, the breadth of transcriptome differences across diverse human populations has not been systematically analyzed. To better understand the spectrum of gene expression variation, alternative splicing, and the population genetics of regulatory variation in humans, we have sequenced the genomes, exomes, and transcriptomes of EBV transformed lymphoblastoid cell lines derived from 45 individuals in the Human Genome Diversity Panel (HGDP). The populations sampled span the geographic breadth of human migration history and include Namibian San, Mbuti Pygmies of the Democratic Republic of Congo, Algerian Mozabites, Pathan of Pakistan, Cambodians of East Asia, Yakut of Siberia, and Mayans of Mexico. We discover that approximately 25.0% of the variation in gene expression found amongst individuals can be attributed to population differences. However, we find few genes that are systematically differentially expressed among populations. Of this population-specific variation, 75.5% is due to expression rather than splicing variability, and we find few genes with strong evidence for differential splicing across populations. Allelic expression analyses indicate that previously mapped common regulatory variants identified in eight populations from the International Haplotype Map Phase 3 project have similar effects in our seven sampled HGDP populations, suggesting that the cellular effects of common variants are shared across diverse populations. Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and regulatory genetics across populations from the broadest points of human migration history yet sampled.

Highlights

  • A central challenge in modern medical and population genomics is identifying trait-disposing genetic variants, interpreting their molecular consequences, and determining the transferability of their functional roles across individuals and populations

  • By leveraging the high resolution of RNA sequencing and broad population sampling, we survey the landscape of transcriptome variation across a globally distributed set of seven populations that span a breadth of human genetic variation and major dispersal events

  • We find only 44 transcripts that show significant differences in expression, likely as a result of the small sample size, but we find that 25% of the variance in gene expression is due to population differences

Read more

Summary

Introduction

A central challenge in modern medical and population genomics is identifying trait-disposing genetic variants, interpreting their molecular consequences, and determining the transferability of their functional roles across individuals and populations. We build upon previous work using transcript abundance and splicing as model systems for understanding how population substructure can impact the genetic architecture of biomedical traits [1,2,3,4]. Genetic studies of microsatellites panels and single nucleotide polymorphisms (SNPs) have shown a decrease in genetic diversity as a function of a population’s geographic distance from eastern or southern Africa [5,6,7]. This pattern fits a serial founder effect model, but it remains unclear whether transcriptome variation follows this pattern and how closely genetic effects on regulation mirror human migration history.

Objectives
Methods
Results
Discussion
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.