Abstract

Background: Interlocus gene conversion (IGC) is a recombination-based mechanism that results in the unidirectional transfer of short stretches of sequence between paralogous loci. Although IGC is a well-established mechanism of human disease, the extent to which this mutagenic process has shaped overall patterns of segregating variation in multi-copy regions of the human genome remains unknown. One expected manifestation of IGC in population genomic data is the presence of one-to-one paralogous SNPs that segregate identical alleles.

Results: Here, I use SNP genotype calls from the low-coverage phase 3 release of the 1000 Genomes Project to identify 15,790 parallel, shared SNPs in duplicated regions of the human genome. My approach for identifying these sites accounts for the potential redundancy of short read mapping in multi-copy genomic regions, thereby effectively eliminating false positive SNP calls arising from paralogous sequence variation. I demonstrate that independent mutation events to identical nucleotides at paralogous sites are not a significant source of shared polymorphisms in the human genome, consistent with the interpretation that these sites are the outcome of historical IGC events. These putative signals of IGC are enriched in genomic contexts previously associated with non-allelic homologous recombination, including clear signals in gene families that form tandem intra-chromosomal clusters.

Conclusions: Taken together, my analyses implicate IGC, not point mutation, as the mechanism generating at least 2.7% of single nucleotide variants in duplicated regions of the human genome.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1681-3) contains supplementary material, which is available to authorized users.

Highlights

  • Interlocus gene conversion (IGC) is a recombination-based mechanism that results in the unidirectional transfer of short stretches of sequence between paralogous loci

  • Using single nucleotide polymorphism (SNP) calls from 1,058 low-coverage whole-genome sequences released by the 1000 Genomes Project and 48,931 global pairwise alignments between well-annotated paralogous sequences in the human reference genome, I identified 48,996 duplicated single nucleotide positions segregating identical alleles

  • Although my approach cannot polarize these SNPs into donor and acceptor sites, I can confidently deduce that one of the two constituent parallel SNPs arose as a consequence of the mutagenic action of IGC, not point mutation
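
The core idea in the highlights, finding paralogous positions that segregate identical alleles, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the paper: the names `shared_paralogous_snps`, `paralog_pairs`, and `snp_alleles` are hypothetical, the paralogous position pairs are assumed to have been extracted beforehand from pairwise alignments, and alleles are assumed to be reported on the same strand at both sites.

```python
from typing import Dict, FrozenSet, List, Tuple

Site = Tuple[str, int]  # (chromosome, 1-based position); hypothetical representation


def shared_paralogous_snps(
    paralog_pairs: List[Tuple[Site, Site]],
    snp_alleles: Dict[Site, FrozenSet[str]],
) -> List[Tuple[Site, Site]]:
    """Return paralogous site pairs where both sites are SNPs with identical alleles.

    paralog_pairs: one-to-one aligned positions between two paralogs (assumed input).
    snp_alleles:   alleles segregating at each called SNP; monomorphic sites are absent.
    """
    shared = []
    for site_a, site_b in paralog_pairs:
        alleles_a = snp_alleles.get(site_a)
        alleles_b = snp_alleles.get(site_b)
        # Both sites must be polymorphic and carry exactly the same allele set.
        if alleles_a is not None and alleles_a == alleles_b:
            shared.append((site_a, site_b))
    return shared


# Toy data: three aligned position pairs, five called SNPs.
pairs = [
    (("chr1", 100), ("chr1", 5000)),
    (("chr1", 101), ("chr1", 5001)),
    (("chr2", 50), ("chr7", 900)),
]
snps = {
    ("chr1", 100): frozenset("AG"),
    ("chr1", 5000): frozenset("AG"),   # same alleles as its paralog: shared SNP
    ("chr1", 101): frozenset("CT"),
    ("chr1", 5001): frozenset("CA"),   # both polymorphic, but alleles differ
    ("chr2", 50): frozenset("GT"),     # paralogous site is monomorphic
}
result = shared_paralogous_snps(pairs, snps)
```

In this toy input only the first pair qualifies. A real analysis would also need to handle reverse-strand alignments and the read-mapping redundancy that the abstract mentions, both of which this sketch ignores.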



Introduction

Segmental duplications (SDs) are among the most rapidly evolving and dynamic loci in the human genome [1, 2]. One subtle, yet ubiquitous, outcome of non-allelic homologous recombination is interlocus gene conversion (IGC), the unidirectional transfer of sequence from one SD to a paralogous SD (Fig. 1a). In this manner, IGC functions as a “copy-and-paste” mechanism that imparts two characteristic signatures on the evolution of duplicated sequences.
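
The “copy-and-paste” behaviour can be made concrete with a toy model. The sketch below assumes two already-aligned paralogous sequences of equal length and a hypothetical helper, `gene_conversion`, that overwrites a short tract of the acceptor with the donor's sequence while leaving the donor unchanged. It illustrates the concept only and does not model the underlying recombination biology.

```python
def gene_conversion(donor: str, acceptor: str, start: int, length: int) -> str:
    """Return the acceptor after a conversion tract is copied in from the donor.

    Both sequences are assumed to be aligned, gap-free, and of equal length.
    `start` is a 0-based offset; the tract is clipped at the sequence end.
    """
    if len(donor) != len(acceptor):
        raise ValueError("donor and acceptor must be the same length")
    end = min(start + length, len(donor))
    # Unidirectional transfer: only the acceptor changes.
    return acceptor[:start] + donor[start:end] + acceptor[end:]


donor = "ACGTACGTAC"
acceptor = "ACGAACGCAC"   # differs from the donor at offsets 3 and 7
converted = gene_conversion(donor, acceptor, start=2, length=3)
# The tract covering offsets 2-4 now matches the donor; offset 7 is untouched.
```

If the donor carries a polymorphic allele inside the tract, the same allele appears at the paralogous position in the acceptor, which is how a shared SNP between paralogs can arise.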

