Abstract

It is well established that GC content varies across the genome in many species and that GC-biased gene conversion, one form of meiotic recombination, is likely to contribute to this heterogeneity. Bird genomes provide an extraordinary system to study the impact of GC-biased gene conversion owing to their specific genomic features. They are characterised by high karyotype conservation with substantial heterogeneity in chromosome sizes, with up to a dozen large macrochromosomes and many smaller microchromosomes common across all bird species. This heterogeneity in chromosome morphology is also reflected in other genomic features: smaller chromosomes are more gene-dense, more compact and more GC-rich than their macrochromosomal counterparts, illustrating that the intensity of GC-biased gene conversion varies across the genome. Here we study whether it is possible to infer heterogeneity in GC-biased gene conversion rates across the genome using a recently published method that accounts for GC-biased gene conversion when estimating branch lengths in a phylogenetic context. To infer the strength of GC-biased gene conversion, we contrast branch length estimates across the genome obtained with and without taking non-stationary GC composition into account. Using simulations, we show that this approach works well when GC fixation bias is strong, and we note that the number of substitutions along a branch is consistently overestimated when GC-biased gene conversion is not accounted for. We use this predictable feature to infer the strength of GC dynamics across the great tit genome by applying our new test statistic to data at 4-fold degenerate sites from three bird species (great tit, zebra finch and chicken) whose genomes are among the best annotated bird genomes to date. With a simple one-dimensional binning, we fail to capture the signal of fixation bias observed in our simulations. However, using a multidimensional binning strategy, we find evidence for heterogeneity in the strength of fixation bias, including AT fixation bias. This highlights the difficulties of combining sequence data across different regions of the genome.

Highlights

  • DNA sequence divergence between species is an important quantity in evolutionary analyses and population genetic approaches, such as molecular dating, phylogeny reconstruction and the inference of selection

  • We find that binning according to current GC content, a frequently applied method (Bolivar et al, 2016; Corcoran et al, 2017), reveals little evidence for GC biased gene conversion across genes based on branch length estimations

  • We have shown using simulations that it is both necessary and possible to take non-stationary GC content into account when estimating branch lengths in order to capture the impact of nucleotide fixation bias

Introduction

DNA sequence divergence between species is an important quantity in evolutionary analyses and population genetic approaches, such as molecular dating, phylogeny reconstruction and the inference of selection. One process that can affect it is GC-biased gene conversion: the biased fixation of strong (G and C) over weak (A and T) base variants at heterozygous sites, caused by gene conversion. It arises from a repair-induced gene conversion process that preferentially incorporates G/C nucleotides over A/T nucleotides during meiosis in many animal species (Duret and Galtier, 2009). It has been noted that accurately estimating sequence divergence can be difficult when GC content is not at equilibrium (Matsumoto et al, 2015).
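To make the idea of a fixation bias concrete, the following minimal sketch uses a standard population-genetic approximation in which the bias acts like weak selection with a scaled coefficient B. It is not the method used in this study. The names `relative_fixation_rate`, `equilibrium_gc`, `B` and `mutation_bias` are illustrative assumptions, as is the simple two-state (weak/strong) model.

```python
import math


def relative_fixation_rate(B: float) -> float:
    """Fixation probability of a new variant relative to a neutral one.

    B is an assumed scaled bias coefficient: positive values favour the
    variant (e.g. a weak-to-strong change under GC-biased gene conversion),
    negative values disfavour it.
    """
    if abs(B) < 1e-8:
        return 1.0  # neutral limit of B / (1 - exp(-B))
    return B / (1.0 - math.exp(-B))


def equilibrium_gc(B: float, mutation_bias: float) -> float:
    """Expected equilibrium GC content in an assumed two-state model.

    mutation_bias is the ratio of the strong-to-weak mutation rate to the
    weak-to-strong mutation rate (a hypothetical parameter for illustration).
    """
    return 1.0 / (1.0 + mutation_bias * math.exp(-B))


if __name__ == "__main__":
    # Illustrative values only; they are not estimates from any data set.
    for B in (-1.0, 0.0, 1.0, 3.0):
        ws = relative_fixation_rate(B)   # weak -> strong changes
        sw = relative_fixation_rate(-B)  # strong -> weak changes
        gc = equilibrium_gc(B, mutation_bias=2.0)
        print(f"B={B:+.1f}  W->S rate x{ws:.3f}  S->W rate x{sw:.3f}  GC*={gc:.3f}")
```

Under these assumptions, a positive B raises the weak-to-strong substitution rate and lowers the strong-to-weak rate, so sequences whose current GC content differs from the implied equilibrium are non-stationary, which is the situation in which divergence estimates can become unreliable. A negative B corresponds to an AT fixation bias.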
