Abstract

The immunoglobulin heavy variable (IGHV) and T cell receptor beta variable (TRBV) loci are among the most complex and variable regions in the human genome. Generated through a process of gene duplication/deletion and diversification, these loci can vary extensively between individuals in copy number and contain genes that are highly similar, making their analysis technically challenging. Here, we present a comprehensive study of the functional gene segments in the IGHV and TRBV loci, quantifying their copy number and single-nucleotide variation in a globally diverse sample of 109 (IGHV) and 286 (TRBV) humans from more than 100 populations. We find that the IGHV and TRBV gene families exhibit starkly different patterns of variation. In addition to providing insight into the different evolutionary paths of the IGHV and TRBV loci, our results are also important to the adaptive immune repertoire sequencing community, where the lack of frequencies of common alleles and copy number variants is hampering existing analytical pipelines.

Highlights

  • By some estimates, genomic variation due to copy number differences underlies more variation in the human genome than that due to single-nucleotide differences (Tuzun et al, 2005; Sudmant et al, 2015)

  • Two gene families that are of particular biomedical relevance but for which variation is not well characterized are the immunoglobulin heavy variable (IGHV) family, a 1-Mb locus located on chromosome 14 (Matsuda et al, 1998; Watson et al, 2013), and the T-cell receptor beta variable (TRBV) family, a 500-kb locus located on chromosome 7 (Rowen et al, 1996)

  • We found 5 single-nucleotide variants (SNVs) private to a single geographic region in the 11 two-copy IGHV gene segments, and 14 such variants in the 40 two-copy TRBV gene segments (Table S4)

Introduction

By some estimates, genomic variation due to copy number differences underlies more variation in the human genome than that due to single-nucleotide differences (Tuzun et al, 2005; Sudmant et al, 2015). The IGHV and TRBV loci are each organized as a series of approximately 45 functional V gene segments and are adjacent to a collection of D (diversity) and J (joining) segments. Although both loci are present in the genomes of all vertebrates known to have an adaptive immune system, the arrangement of the IGHV locus can differ between species (Cannon et al, 2004; Das et al, 2008; Flajnik & Kasahara, 2010). The genes comprising the IGHV and TRBV loci are distant paralogs and are believed to derive from a common ancestral locus in a vertebrate contemporaneous with or predating jawed fishes (Cannon et al, 2004; Das et al, 2008; Flajnik & Kasahara, 2010). That these two loci share genomic features and evolutionary origins makes them an ideal system for a comparative study of gene family evolution.

