Abstract

The B cells in our body generate protective antibodies by introducing somatic hypermutations (SHM) into the variable region of immunoglobulin genes (IgVs). The mutations are generated by activation-induced deaminase (AID), which converts cytosine to uracil in single-stranded DNA (ssDNA) generated during transcription. Attempts have been made to correlate SHM with ssDNA by using bisulfite to chemically convert cytosines that are accessible in the intact chromatin of mutating B cells. These studies have been complicated by the use of different definitions of “bisulfite accessible regions” (BARs). Recently, deep sequencing has provided much larger datasets of such regions, but computational methods are needed to enable this analysis. Here we combined the deep-sequencing approach with unique molecular identifiers and developed a novel Hidden Markov Model based Bayesian Segmentation algorithm to characterize the ssDNA regions in the IGHV4-34 gene of the human Ramos B cell line. Combining hierarchical clustering and our new Bayesian model, we identified recurrent BARs in certain subregions of both the top and bottom strands of this gene. With this new system, the average size of BARs is about 15 bp. We also identified potential G-quadruplex DNA structures in this gene and found that the BARs co-locate with G-quadruplex structures in the opposite strand. Various correlation analyses showed no direct site-to-site relationship between the bisulfite accessible ssDNA and all sites of SHM, but most of the highly AID-mutated sites are within 15 bp of a BAR. In summary, we developed a novel platform to study single-stranded DNA in chromatin at base-pair resolution that reveals potential relationships among BARs, SHM and G-quadruplexes. This platform could be applied to genome-wide studies in the future.

Highlights

  • High affinity antibodies that can neutralize viruses play a major role in protecting us from viral infections

  • A bisulfite assay, together with deep sequencing, was used to characterize the accessible single-stranded DNA (ssDNA) that represents the substrate of activation-induced deaminase (AID) in B cells

  • To deal with issues such as noise in the data, we developed a novel algorithm to more accurately identify bisulfite accessible ssDNA regions (BARs) and applied it to the IGHV4-34 immunoglobulin gene in a human B cell line
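To make the idea of segmenting noisy bisulfite data concrete, the following is a minimal sketch, not the authors' Bayesian segmentation algorithm. It decodes a plain two-state hidden Markov model with the Viterbi algorithm over the cytosines of a single read, where 1 means a converted cytosine and 0 an unconverted one. The function name `viterbi_segments` and all probabilities (`p_conv`, `p_stay`, `p_start`) are hypothetical values chosen for illustration, not parameters from the study.

```python
import math


def viterbi_segments(obs, p_conv=(0.05, 0.9), p_stay=0.9, p_start=(0.5, 0.5)):
    """Decode a two-state HMM and return runs labeled as accessible.

    obs     : list of 0/1 calls, one per cytosine (1 = converted by bisulfite).
    p_conv  : assumed conversion probability in state 0 (protected) and
              state 1 (accessible); illustrative values only.
    p_stay  : assumed probability of remaining in the same state.
    Returns a list of (start, end) index pairs, inclusive, for state-1 runs.
    """
    if not obs:
        return []
    states = (0, 1)
    # Log emission table: emit[state][observation].
    emit = [[math.log(1.0 - p), math.log(p)] for p in p_conv]
    # Log transition table: trans[previous][next].
    trans = [[math.log(p_stay if i == j else 1.0 - p_stay) for j in states]
             for i in states]

    score = [math.log(p_start[s]) + emit[s][obs[0]] for s in states]
    back = []  # back[t][s] = best previous state for state s at step t + 1
    for o in obs[1:]:
        new_score, ptr = [], []
        for s in states:
            prev = max(states, key=lambda r: score[r] + trans[r][s])
            ptr.append(prev)
            new_score.append(score[prev] + trans[prev][s] + emit[s][o])
        score = new_score
        back.append(ptr)

    # Trace back the most likely state path.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()

    # Collapse consecutive state-1 positions into segments.
    segments, start = [], None
    for i, s in enumerate(path):
        if s == 1 and start is None:
            start = i
        elif s == 0 and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(path) - 1))
    return segments


# A run of eight converted cytosines flanked by unconverted ones.
read = [0] * 10 + [1] * 8 + [0] * 10
print(viterbi_segments(read))
```

Under these assumed parameters, a single converted cytosine surrounded by unconverted ones is treated as noise rather than as an accessible region, because two state switches cost more than one unlikely emission. Positions here index cytosines, not base pairs, so mapping segments back to genomic coordinates would be an additional step.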


Summary

Introduction

High-affinity antibodies that can neutralize viruses play a major role in protecting us from viral infections. Such protective antibodies are often generated through the selective somatic hypermutation (SHM) of heavy and light chain antibody variable (V) region genes that encode the antigen binding sites in antibodies. AID-induced mutations are largely restricted to the V region exon and to the switch regions that are located downstream and required for isotype switching. This process of AID-induced SHM requires a high level of transcription, which is presumably necessary to make the ssDNA substrate available [2, 4, 5]. There is increasing evidence that non-B forms of DNA, especially G-quadruplexes (G4), make ssDNA available to directly bind AID and play a role in targeting AID-induced mutations to Ig switch regions, acting as a key mechanism in class switch recombination, and potentially in variable regions [9,10,11,12].

