Abstract

Analysis of antibody repertoires by high-throughput sequencing is of major importance in understanding adaptive immune responses. Our knowledge of variations in the genomic loci encoding immunoglobulin genes is incomplete, resulting in conflicting VDJ gene assignments and biased genotype and haplotype inference. Haplotypes can be inferred using IGHJ6 heterozygosity, which is observed in one third of people. Here, we propose a robust novel method for determining VDJ haplotypes by adapting a Bayesian framework. Our method extends haplotype inference to IGHD- and IGHV-based analysis, enabling inference of deletions and copy number variations in the entire population. To test this method, we generated a multi-individual data set of naive B-cell repertoires, and found allele usage bias, as well as a mosaic, tiled pattern of deleted IGHD and IGHV genes. The inferred haplotypes may have clinical implications for genetic disease predispositions. Our findings expand the knowledge that can be extracted from antibody repertoire sequencing data.

Highlights

  • Analysis of antibody repertoires by high-throughput sequencing is of major importance in understanding adaptive immune responses

  • The heavy chains are assembled by a complex process involving somatic recombination of a large number of germline-encoded IGHV, IGHD, and IGHJ genes, along with junctional diversity that is added at the boundaries where these genes are joined together[1]

  • In agreement with previous studies[23], genotyping resulted in a five-fold reduction in multiple assignments of a sequence for V genes, and a two-fold reduction for D genes. This reduction was observed by genotyping sequences that were aligned using three different tools: IgBLAST[29], IMGT HighV-QUEST[30], and partis[25] (Supplementary Fig. 1A). ~2% of sequences were initially assigned to genes that were removed during genotyping


Summary

Introduction

Analysis of antibody repertoires by high-throughput sequencing is of major importance in understanding adaptive immune responses. Our method extends haplotype inference to IGHD- and IGHV-based analysis, enabling inference of deletions and copy number variations in the entire population. To test this method, we generated a multi-individual data set of naive B-cell repertoires, and found allele usage bias, as well as a mosaic, tiled pattern of deleted IGHD and IGHV genes. Correct assignment of antibody sequences to specific germline V, D, and J genes is a critical step in AIRR-seq analysis. It is the basis for identifying somatic hypermutation, pairing biases, N additions and exonuclease removals, determining gene usage distribution, and studying the link between AIRR-seq data and clinical conditions. Because of the difficulty in performing physical sequencing of these loci, several computational tools have been developed for personal genotype inference from AIRR-seq data[3,23,24,25].
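
The anchor-gene idea mentioned in the abstract (haplotype inference from IGHJ6 heterozygosity) can be illustrated with a minimal sketch. Because each heavy-chain rearrangement joins V, D, and J genes from a single chromosome, a V allele that repeatedly co-occurs with one IGHJ6 allele can be attributed to that chromosome. The function name `infer_haplotype`, the `min_count` threshold, and the toy allele calls below are hypothetical and chosen only for illustration. The fixed count threshold is a simple stand-in for the Bayesian framework, whose details are not given in this summary.

```python
from collections import Counter, defaultdict


def infer_haplotype(pairs, anchor_alleles, min_count=3):
    """Assign V alleles to chromosomes using a heterozygous anchor J gene.

    pairs          -- iterable of (v_allele, j_allele), one per rearranged sequence
    anchor_alleles -- the two alleles of the heterozygous anchor gene
    min_count      -- hypothetical cutoff; NOT the paper's Bayesian test
    """
    # Count how often each V allele is rearranged with each anchor allele.
    counts = defaultdict(Counter)
    for v_allele, j_allele in pairs:
        if j_allele in anchor_alleles:  # ignore sequences using other J genes
            counts[v_allele][j_allele] += 1

    # A V allele is placed on a chromosome if it is seen often enough
    # together with that chromosome's anchor allele.
    haplotype = {anchor: [] for anchor in anchor_alleles}
    for v_allele in sorted(counts):
        for anchor in anchor_alleles:
            if counts[v_allele][anchor] >= min_count:
                haplotype[anchor].append(v_allele)
    return haplotype


# Toy allele calls (invented for illustration, not real repertoire data).
toy_pairs = (
    [("IGHV1-69*01", "IGHJ6*02")] * 5
    + [("IGHV1-69*06", "IGHJ6*03")] * 4
    + [("IGHV1-69*01", "IGHJ6*03")] * 1   # rare pairing, below the cutoff
    + [("IGHV3-23*01", "IGHJ6*02")] * 6
    + [("IGHV3-23*01", "IGHJ6*03")] * 7   # same allele on both chromosomes
    + [("IGHV1-2*02", "IGHJ4*02")] * 10   # not informative: J gene is not the anchor
)

result = infer_haplotype(toy_pairs, ("IGHJ6*02", "IGHJ6*03"))
```

In this toy input, the two IGHV1-69 alleles separate onto different chromosomes, while IGHV3-23*01 appears on both. A gene with alleles on only one chromosome would be a candidate single-chromosome deletion, which is the kind of pattern the summary describes for IGHD and IGHV genes.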
