Abstract
The diversity of B cell receptors provides a basis for recognizing numerous pathogens. Antibody repertoire sequencing has revealed relationships between B cell receptor sequences, their diversity, and their function in infection, vaccination, and disease. However, many repertoire datasets have been deposited without annotation or quality control, limiting their utility. To accelerate investigations of B cell immunoglobulin sequence repertoires and to facilitate development of algorithms for their analysis, we constructed a comprehensive public database of curated human B cell immunoglobulin sequence repertoires, cAb-Rep (https://cab-rep.c2b2.columbia.edu), which currently includes 306 immunoglobulin repertoires from 121 human donors, who were healthy, vaccinated, or had autoimmune disease. The database contains a total of 267.9 million V(D)J heavy chain and 72.9 million VJ light chain transcripts. These transcripts are full-length or near full-length, have been annotated with gene origin, antibody isotype, somatic hypermutations, and other biological characteristics, and are stored in FASTA format to facilitate their direct use by most current repertoire-analysis programs. We describe a website to search cAb-Rep for similar antibodies along with methods for analysis of the prevalence of antibodies with specific genetic signatures, for estimation of reproducibility of somatic hypermutation patterns of interest, and for delineating frequencies of somatically introduced N-glycosylation. cAb-Rep should be useful for investigating attributes of B cell sequence repertoires, for understanding characteristics of affinity maturation, and for identifying potential barriers to the elicitation of effective neutralizing antibodies in infection or by vaccination.
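As an illustration of the N-glycosylation analysis mentioned above, the sketch below scans amino-acid sequences for the canonical N-linked glycosylation sequon (N-X-S/T, where X is any residue except proline) and reports the fraction of sequences that carry at least one. It is not the cAb-Rep implementation. It assumes the transcripts have already been translated to protein, and the helper names (`read_fasta`, `sequon_positions`, `fraction_with_sequon`) are hypothetical. Deciding whether a sequon was somatically introduced would also require comparison with the germline gene, which this sketch does not attempt.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# Canonical sequon N-X-S/T with X != P. The lookahead keeps matches
# zero-width after the N so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sequon_positions(protein: str) -> List[int]:
    """Return 0-based positions of the N in each N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(protein.upper())]


def fraction_with_sequon(records: Iterable[Tuple[str, str]]) -> float:
    """Fraction of records whose sequence has at least one sequon."""
    total = hits = 0
    for _, seq in records:
        total += 1
        hits += bool(sequon_positions(seq))
    return hits / total if total else 0.0
```

A file of translated sequences could be passed as `fraction_with_sequon(read_fasta(open(path)))`, where `path` is a placeholder for a locally downloaded file.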
Highlights
B cells comprise a crucial component of the adaptive immune response [1]
We first assembled 376 B cell receptor (BCR) repertoire deep-sequencing datasets from the NCBI Sequence Read Archive (SRA) database, each sequenced on Illumina MiSeq or HiSeq and prepared with library protocols that cover the full-length V(D)J region [5′ primers at leader regions or 5′ rapid amplification of cDNA ends (RACE), 3′ primers targeting constant region 1] or the near-full-length region (5′ primers at the N-terminus of the framework 1 region)
High-quality BCR repertoire datasets are critical for understanding the mechanisms by which V(D)J recombination and somatic hypermutation generate BCR diversity
Summary
B cells comprise a crucial component of the adaptive immune response [1]. B cells recognize three-dimensional epitopes of antigens through the variable domains of the B cell receptor (BCR) or of its secreted antibody forms. A comprehensive database of curated and well-annotated BCR transcripts should accelerate repertoire studies, including but not limited to characterization of BCR diversity, mechanisms of clonal expansion, development of BCR repertoire analysis algorithms, estimation of the frequency of antigen-specific antibodies and their precursor-like cells, and the effects of somatic hypermutation (SHM). Such a database should also assist researchers in performing repertoire-related data mining.