Abstract

Sequences upstream and downstream of immunoglobulin genes may affect the expression of those genes. However, these sequences are rarely characterized in studies of immunoglobulin repertoires. Inference from large data sets of rearranged immunoglobulin transcripts offers an opportunity to define the upstream regions (5’-untranslated regions and leader sequences). We established a new data pre-processing procedure to eliminate artifacts introduced during 5’-RACE library generation, reanalyzed a previously studied data set defining human immunoglobulin heavy chain genes, and identified novel upstream regions, as well as previously reported upstream regions that may have been identified in error. Upstream sequences were also identified for a set of previously uncharacterized germline gene alleles. Several novel upstream region variants were validated, for instance by their segregation to a single haplotype in heterozygous subjects. SNPs representing several sequence variants were identified in population data. Finally, based on the outcomes of the analysis, we define a set of testable hypotheses regarding the placement of particular alleles in complex IGHV locus haplotypes, and discuss the evolutionary relatedness of particular heavy chain variable genes based on the sequences of their upstream regions.

Highlights

  • Immunoglobulins play a vital role in the recognition of pathogens, enabling their removal or the modification of their activities or functions

  • To address the incomplete representation of 5’-untranslated region (5’UTR) and leader sequences of antibody genes in the IMGT database [6], we examined such sequences in a publicly available antibody transcript data set from 98 individuals [19], which was analyzed for the same purpose by Mikocziova et al. [18]

  • Using a strict pre-processing and filtering pipeline followed by extraction of consensus 5’UTR-leader sequences (Figure 1), we identified 166 sequences, each found in between 1 and 98 individuals (Figure 2; Supplementary Table 1 and Supplementary Data 2)

Summary

Introduction

Immunoglobulins play a vital role in the recognition of pathogens, enabling their removal or the modification of their activities or functions. The typical antibody consists of two identical heavy (H) chains and two identical light chains, of which the H chain often plays a dominant role in determining specificity [1]. It has become possible to study features of the adaptive immune receptor repertoire (AIRR) at the level of an individual's personal germline genes, a key factor in the nature of developing immune responses [2]. The personal germline gene repertoire may contribute substantially to the development of specific antibodies, particularly given the importance of stereotyped (public) immune responses against a number of antigens [3].
