Abstract

Germline variations in immunoglobulin genes influence the repertoire of B cell receptors and antibodies, and such polymorphisms may impact disease susceptibility. However, knowledge of genomic variation in the immunoglobulin loci remains scarce. Here, we report 25 potential novel germline IGHV alleles as inferred from rearranged naïve B cell cDNA repertoires of 98 individuals. Thirteen novel alleles were selected for validation, of which ten were successfully confirmed by targeted amplification and Sanger sequencing of non-B cell DNA. Moreover, we detected a high degree of variability upstream of the V-REGION in the 5′UTR, L-PART1 and L-PART2 sequences, and found that identical V-REGION alleles can differ in upstream sequences. Thus, we have identified extensive genetic variation not only in the V-REGION but also in the upstream sequences of IGHV genes. Our findings provide a new perspective for annotating immunoglobulin repertoire sequencing data.

Highlights

  • Immunoglobulins are an important part of the adaptive immune system

  • We are the first to provide a comprehensive overview of the heavy chain upstream (5′UTR, L-PART1, and L-PART2) sequence variants in an AIRR-seq dataset

  • We validated ten of thirteen selected novel alleles by targeted amplification of genomic DNA from the same individuals

Summary

Introduction

Immunoglobulins are an important part of the adaptive immune system. They exert their function either as the antigen receptor of B cells, which is essential for the antigen presentation capacity of these cells [1], or as secreted antibodies that survey the extracellular fluids of the body. The genes of the heavy chain are located on chromosome 14 (14q32.33) [3], while the light chain genes are present at two separate loci, kappa and lambda, which are located on chromosome 2 (2p11.2) and chromosome 22 (22q11.2), respectively [4]. These loci remain incompletely characterized because they contain many repetitive sequence segments and duplicated genes [5], which makes it difficult to correctly assemble short reads from whole genome sequencing. To date, a limited number of genomically sequenced [6,7,8] and inferred [9,10] haplotypes of the heavy chain and the two light chain loci have been described. Different databases exist for genomic immune receptor DNA sequences (IMGT/GENE-DB [11]), putative novel variants from inferred data (IgPdb, https://cgi.cse.unsw.edu.au/~ihmmune/IgPdb/information.php) or entire immune receptor repertoires (OGRDB [12]).
