Abstract

Background: While the abundance of available sequenced genomes has led to many studies of regional heterogeneity in mutation rates, the co-variation among rates of different mutation types remains largely unexplored, hindering a deeper understanding of mutagenesis and genome dynamics. Here, utilizing primate and rodent genomic alignments, we apply two multivariate analysis techniques (principal components and canonical correlations) to investigate the structure of rate co-variation for four mutation types and simultaneously explore the associations with multiple genomic features at different genomic scales and phylogenetic distances.

Results: We observe a consistent, largely linear co-variation among rates of nucleotide substitutions, small insertions and small deletions, with some non-linear associations detected among these rates on chromosome X and near autosomal telomeres. This co-variation appears to be shaped by a common set of genomic features, some previously investigated and some novel to this study (nuclear lamina binding sites, methylated non-CpG sites and nucleosome-free regions). Strong non-linear relationships are also detected among genomic features near the centromeres of large chromosomes. Microsatellite mutability co-varies with other mutation rates at finer scales, but not at 1 Mb, and shows varying degrees of association with genomic features at different scales.

Conclusions: Our results allow us to speculate about the role of different molecular mechanisms, such as replication, recombination, repair and local chromatin environment, in mutagenesis. The software tools developed for our analyses are available through Galaxy, an open-source genomics portal, to facilitate the use of multivariate techniques in future large-scale genomics studies.

Highlights

  • The non-coding nonrepetitive (NCNR) subgenome was constructed by excluding genes and 5-kb flanking regions around them, other computationally predicted and/or experimentally validated functional elements, and all repeats identified by RepeatMasker [28]

  • The use of multivariate techniques was crucial to our investigation of mutation rate co-variation and its relationship with the genomic landscape, as it allowed us to consider several rates and several genomic features simultaneously
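
To illustrate what considering several rates simultaneously can look like, the following is a minimal sketch of a principal component analysis on four per-window mutation rates. Everything in it is an assumption made for illustration: the data are synthetic, the rate names, window count, and noise levels are hypothetical, and the code is not the authors' Galaxy tooling or their actual procedure. It uses only the Python standard library; a real analysis would more likely rely on a numerical library. Canonical correlation analysis, the second technique the study applies, is not shown.

```python
import math
import random


def standardize(column):
    """Center a column and scale it to unit (population) variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / sd for v in column]


def correlation_matrix(columns):
    """Pearson correlation matrix of equally long columns."""
    z = [standardize(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def mat_vec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def first_principal_component(corr, iterations=500):
    """Leading eigenvector (PC1 loadings) and eigenvalue via power iteration."""
    p = len(corr)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = mat_vec(corr, vec)
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(a * b for a, b in zip(vec, mat_vec(corr, vec)))
    return vec, eigenvalue


# Synthetic, hypothetical data: three rates share one latent factor,
# while the fourth (microsatellite mutability) is generated independently.
rng = random.Random(0)
n_windows = 500
shared = [rng.gauss(0, 1) for _ in range(n_windows)]
rates = {
    "substitution": [s + 0.5 * rng.gauss(0, 1) for s in shared],
    "insertion": [s + 0.5 * rng.gauss(0, 1) for s in shared],
    "deletion": [s + 0.5 * rng.gauss(0, 1) for s in shared],
    "microsatellite": [rng.gauss(0, 1) for _ in range(n_windows)],
}

corr = correlation_matrix(list(rates.values()))
loadings, eigenvalue = first_principal_component(corr)
# The trace of a correlation matrix equals the number of variables,
# so eigenvalue / number of variables is the fraction of variance on PC1.
explained = eigenvalue / len(rates)

for name, loading in zip(rates, loadings):
    print(f"{name:15s} PC1 loading = {loading:+.2f}")
print(f"PC1 explains {explained:.0%} of the total variance")
```

In this toy setup the three rates built from the shared factor load together on the first component while the independently generated rate contributes little, which is the kind of pattern a single pairwise correlation would not summarize as compactly.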

Summary

Introduction

The availability of a multitude of sequenced genomes and their alignments provides an opportunity to study mutations on a genome-wide scale in many species, including humans. Although the role of methylation in generating mutations at CpG sites has been extensively studied [2,6,8,9,10], no study to date has examined the potential impact of the non-CpG methylome on the genome and its mutagenesis; in particular, methylated non-CpG cytosines may elevate mutation rates. Assessing the contribution of the three genomic features novel to this study (nuclear lamina binding sites, methylated non-CpG sites and nucleosome-free regions) to mutation rate variation is therefore of immediate interest.
