Abstract
Analysis of antibody repertoires by high-throughput sequencing is of major importance in understanding adaptive immune responses. Our knowledge of variations in the genomic loci encoding immunoglobulin genes is incomplete, resulting in conflicting VDJ gene assignments and biased genotype and haplotype inference. Haplotypes can be inferred using IGHJ6 heterozygosity, observed in one third of the people. Here, we propose a robust novel method for determining VDJ haplotypes by adapting a Bayesian framework. Our method extends haplotype inference to IGHD- and IGHV-based analysis, enabling inference of deletions and copy number variations in the entire population. To test this method, we generated a multi-individual data set of naive B-cell repertoires, and found allele usage bias, as well as a mosaic, tiled pattern of deleted IGHD and IGHV genes. The inferred haplotypes may have clinical implications for genetic disease predispositions. Our findings expand the knowledge that can be extracted from antibody repertoire sequencing data.
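To make the idea of anchor-based haplotyping concrete, the following is a toy sketch and not the authors' model. It assumes that, for an individual heterozygous at an anchor gene such as IGHJ6, we have counted how many rearrangements carrying a given V allele are joined to each of the two anchor alleles. The three hypotheses, the uniform prior, the binomial likelihood, and the `error` parameter are all illustrative assumptions.

```python
from math import comb


def haplotype_posterior(n_a, n_b, error=0.05):
    """Toy posterior over the chromosomal location of one allele.

    n_a, n_b: counts of rearrangements in which the allele is joined to
              anchor allele A or anchor allele B (hypothetical inputs).
    error:    assumed fraction of reads linked to the wrong anchor.
    """
    # Expected fraction of reads linked to anchor A under each hypothesis.
    p_anchor_a = {
        "chromosome_A_only": 1.0 - error,
        "chromosome_B_only": error,
        "both_chromosomes": 0.5,
    }
    n = n_a + n_b
    # Binomial likelihood of the observed counts under each hypothesis.
    likelihood = {
        h: comb(n, n_a) * p ** n_a * (1.0 - p) ** n_b
        for h, p in p_anchor_a.items()
    }
    # Uniform prior, so the posterior is the normalised likelihood.
    total = sum(likelihood.values())
    return {h: value / total for h, value in likelihood.items()}


if __name__ == "__main__":
    for counts in [(40, 1), (20, 20), (2, 38)]:
        post = haplotype_posterior(*counts)
        best = max(post, key=post.get)
        print(counts, best, round(post[best], 3))
```

A skewed count such as (40, 1) favours placement on a single chromosome, while balanced counts favour presence on both. A real method would also need to handle low counts, unequal expression of the two chromosomes, and gene duplications.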
Highlights
Analysis of antibody repertoires by high-throughput sequencing is of major importance in understanding adaptive immune responses
The heavy chains are assembled by a complex process involving somatic recombination of a large number of germline-encoded IGHV, IGHD, and IGHJ genes, along with junctional diversity that is added at the boundaries where these genes are joined together[1]
In agreement with previous studies[23], genotyping resulted in a five-fold reduction in multiple assignments of a sequence for V genes, and a two-fold reduction for D genes. This reduction was observed when genotyping sequences that were aligned using three different tools: IgBLAST[29], IMGT/HighV-QUEST[30], and partis[25] (Supplementary Fig. 1A). Approximately 2% of sequences were initially assigned to genes that were removed during genotyping
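The effect of genotyping on ambiguous assignments can be illustrated with a minimal sketch. It assumes gene calls are stored as comma-separated allele names (the convention for ambiguous `v_call` values in AIRR-style tables), and it simply filters each call against a personal genotype. The function names and the example calls are hypothetical, and the tools cited above may instead re-align sequences against the reduced reference.

```python
def restrict_to_genotype(call, genotype):
    """Keep only the alleles in a comma-separated call that are in the genotype.

    Returns an empty string when none of the called alleles remain.
    """
    kept = [allele for allele in call.split(",") if allele in genotype]
    return ",".join(kept)


def multi_assignment_fraction(calls):
    """Fraction of non-empty calls that list more than one allele."""
    assigned = [c for c in calls if c]
    if not assigned:
        return 0.0
    return sum("," in c for c in assigned) / len(assigned)


if __name__ == "__main__":
    # Illustrative calls only, not data from the study.
    calls = [
        "IGHV1-69*01,IGHV1-69*06",
        "IGHV3-23*01",
        "IGHV4-34*01,IGHV4-34*02",
        "IGHV1-2*02",
    ]
    genotype = {"IGHV1-69*01", "IGHV3-23*01", "IGHV4-34*01", "IGHV4-34*02"}

    genotyped = [restrict_to_genotype(c, genotype) for c in calls]
    print("before:", multi_assignment_fraction(calls))
    print("after: ", multi_assignment_fraction(genotyped))
    # Calls that become empty were assigned to alleles absent from the genotype.
    print("removed:", sum(1 for c in genotyped if not c))
```

In this toy input, one ambiguous call is resolved because only one of its alleles is in the genotype, one remains ambiguous because the individual carries both alleles, and one call is dropped because its only allele is absent from the genotype.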
Summary
Analysis of antibody repertoires by high-throughput sequencing is of major importance in understanding adaptive immune responses. Our method extends haplotype inference to IGHD- and IGHV-based analysis, enabling inference of deletions and copy number variations in the entire population. To test this method, we generated a multi-individual data set of naive B-cell repertoires, and found allele usage bias, as well as a mosaic, tiled pattern of deleted IGHD and IGHV genes. Correct assignment of antibody sequences to specific germline V, D, and J genes is a critical step in AIRR-seq analysis. It is the basis for identifying somatic hypermutation, pairing biases, N additions and exonuclease removals, determining gene usage distribution, and studying the link between AIRR-seq data and clinical conditions. Because of the difficulty in performing physical sequencing of these loci, several computational tools have been developed for personal genotype inference from AIRR-seq data[3,23,24,25].