Abstract

Lymphoblastoid cell lines (LCLs) have been critical to establishing genetic resources for biomedical science. They have been used extensively to study human genetic diversity and genome function, and to inform the development of tools and methodologies for augmenting disease genetics research. While the validity of variant callsets from LCLs has been demonstrated for most of the genome, previous work has shown that DNA extracted from LCLs is modified by V(D)J recombination within the immunoglobulin (IG) loci, regions that harbor antibody genes critical to immune system function. However, the impact of V(D)J recombination on short read sequencing data generated from LCLs has not been extensively investigated. In this study, we used LCL-derived short read sequencing data from the 1000 Genomes Project (n = 2,504) to identify signatures of V(D)J recombination. Our analyses revealed sample-level impacts of V(D)J recombination that varied depending on the degree of inferred monoclonality. We showed that V(D)J-associated somatic deletions impacted genotyping accuracy, leading to adulterated population-level estimates of allele frequency and linkage disequilibrium. These findings illuminate limitations of using LCLs and short read data for building genetic resources in the IG loci, with implications for interpreting previous disease association studies in these regions.

Highlights

  • Lymphoblastoid cell lines (LCL) are generated by infecting B cells with the Epstein Barr Virus (EBV) [1] to create immortalized cell lines

  • We reasoned that the presence of V(D)J recombination events would result in larger insert sizes, and that these would be enriched within IGH

  • We found that the mean percentage of heterozygous variants telomeric of the IGHV gene segment used for V(D)J recombination was 3.4-fold higher than the mean percentage of heterozygous variants centromeric of that gene segment (P = 0.003, two-sided paired Wilcoxon test; Fig 4A)
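
The insert-size reasoning in the second highlight can be illustrated with a minimal sketch. It is not the authors' pipeline: the `ReadPair` record, the `large_insert_fraction` helper, the 1,000 bp threshold, and the region coordinates are all hypothetical and chosen only for illustration, and the read pairs are toy values rather than 1KGP data. The idea is simply that read pairs spanning a V(D)J-associated somatic deletion map farther apart than expected, so the fraction of large-insert pairs should be higher inside the locus of interest than in a comparison region.

```python
from typing import Iterable, NamedTuple


class ReadPair(NamedTuple):
    """Hypothetical record for one aligned read pair."""
    chrom: str
    start: int        # leftmost aligned position of the pair
    insert_size: int  # absolute template length in bp


def large_insert_fraction(
    pairs: Iterable[ReadPair],
    chrom: str,
    start: int,
    end: int,
    threshold: int,
) -> float:
    """Fraction of pairs starting in [start, end) with insert size above threshold."""
    in_region = [p for p in pairs if p.chrom == chrom and start <= p.start < end]
    if not in_region:
        return 0.0
    n_large = sum(1 for p in in_region if p.insert_size > threshold)
    return n_large / len(in_region)


# Toy read pairs; coordinates are placeholders, not real IGH coordinates.
pairs = [
    ReadPair("chr14", 1_000, 350),
    ReadPair("chr14", 1_200, 48_000),   # pair spanning a hypothetical deletion
    ReadPair("chr14", 1_500, 410),
    ReadPair("chr14", 1_800, 120_000),  # pair spanning a hypothetical deletion
    ReadPair("chr14", 9_000, 380),      # comparison region
    ReadPair("chr14", 9_500, 400),      # comparison region
]

# Assumed threshold of 1,000 bp separates typical from unusually large inserts.
igh_fraction = large_insert_fraction(pairs, "chr14", 1_000, 2_000, threshold=1_000)
background_fraction = large_insert_fraction(pairs, "chr14", 9_000, 10_000, threshold=1_000)

print(f"locus of interest: {igh_fraction:.2f}, comparison region: {background_fraction:.2f}")
```

In practice the insert sizes would come from the template length field of the aligned reads, and the threshold would be set relative to each sample's own insert-size distribution rather than fixed in advance.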


Summary

Introduction

Lymphoblastoid cell lines (LCL) are generated by infecting B cells with the Epstein Barr Virus (EBV) [1] to create immortalized cell lines. We sought to evaluate the extent of sample-level V(D)J recombination in LCL-derived short read sequencing data from the 1KGP, and to assess the downstream impacts of these somatic events.

