Abstract

The relationship between DNA sequence, biochemical function, and molecular evolution is relatively well-described for protein-coding regions of genomes, but far less clear in noncoding regions, particularly in eukaryote genomes. In part, this is because we lack a complete description of the essential noncoding elements in a eukaryote genome. To address this gap, we used saturating transposon mutagenesis to interrogate the Schizosaccharomyces pombe genome. We generated 31 million transposon insertions, a theoretical coverage of 2.4 insertions per genomic site. We applied a five-state hidden Markov model (HMM) to distinguish insertion-depleted regions from insertion biases. Both raw insertion density and HMM-defined fitness estimates showed significant quantitative relationships to gene knockout fitness, genetic diversity, divergence, and expected functional regions based on transcription and gene annotations. Through several analyses, we conclude that transposon insertions produced fitness effects in 66–90% of the genome, including substantial portions of the noncoding regions. Based on the HMM, we estimate that 10% of the insertion-depleted sites in the genome showed no signal of conservation between species and were weakly transcribed, demonstrating limitations of comparative genomics and transcriptomics to detect functional units. In this species, 3′- and 5′-untranslated regions were the most prominent insertion-depleted regions that were not represented in measures of constraint from comparative genomics. We conclude that the combination of transposon mutagenesis, evolutionary, and biochemical data can provide new insights into the relationship between genome function and molecular evolution.
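To make the HMM step concrete, the following is a minimal sketch of how a five-state HMM could segment per-window insertion counts into states. It is not the authors' implementation: the Poisson emission model, the per-state rates, the uniform initial distribution, the single `stay` transition probability, and the convention that state 1 is the most insertion-depleted are all illustrative assumptions. The function names `poisson_logpmf` and `viterbi` are hypothetical.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing k insertions under a Poisson rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi(counts, rates, stay=0.9):
    """Most likely state path for a non-empty list of per-window counts.

    Assumptions (not from the paper): Poisson emissions with one rate per
    state, a uniform start distribution, and a transition matrix with
    probability `stay` on the diagonal and the remainder split evenly.
    """
    n = len(rates)
    log_stay = math.log(stay)
    log_switch = math.log((1.0 - stay) / (n - 1))
    # Initialise with a uniform prior over states.
    score = [-math.log(n) + poisson_logpmf(counts[0], r) for r in rates]
    back = []
    for k in counts[1:]:
        prev, pointers, score = score, [], []
        for j, r in enumerate(rates):
            # Best predecessor state for landing in state j.
            cand = [prev[i] + (log_stay if i == j else log_switch)
                    for i in range(n)]
            best_i = max(range(n), key=lambda i: cand[i])
            pointers.append(best_i)
            score.append(cand[best_i] + poisson_logpmf(k, r))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [s + 1 for s in path]  # 1-based labels, i.e. S1..S5


# Hypothetical rates: S1 = strongly insertion-depleted, S5 = insertion-rich.
example_rates = [0.1, 1.0, 3.0, 8.0, 20.0]
example_counts = [0, 0, 0, 0, 0, 20, 20, 20, 20, 20]
print(viterbi(example_counts, example_rates))
```

In this toy input, a run of empty windows followed by a run of densely hit windows is split into two segments. A real analysis would also need to estimate the rates and transition probabilities from the data and to account for insertion-site biases, which this sketch does not attempt.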

Highlights

  • A goal of genetics is to understand what sequence elements within genomes specify cellular and organismal function

  • Our analysis showed that 100-nt windows with hidden Markov model (HMM) states S1/S2 (states 1 and 2) are significantly more constrained within Schizosaccharomyces species, and feature less genetic diversity within S. pombe, than regions with HMM states S3–S5 (states 3 to 5)

  • To examine further whether these insertion metrics contained quantitative information about gene disruption fitness, we compared these measures to the colony sizes of viable knockout mutants on solid media (Malecki and Bähler 2016; Malecki et al 2016). This orthogonal measure of gene disruption fitness alteration uses solid media, a more direct fitness measure, and different methods to interfere with gene function. We found that both metrics were positively correlated with the colony size of knockout mutants

Introduction

A goal of genetics is to understand what sequence elements within genomes specify cellular and organismal function. The highly transcribed protein-coding regions of eukaryote genomes are routinely detected within genomes and are well studied. The numerous noncoding elements, on the other hand, are more challenging to detect, profile, and functionally describe. While biochemical assays of genome activity can indicate functional units, inferring function based solely on biochemical activity, for example, the ENCODE project’s definition of functional DNA (ENCODE Project Consortium et al 2012), is inconsistent with evolutionary analyses that show no signal of conservation for substantial proportions of larger eukaryotic genomes (Doolittle 2013; Graur et al 2013).

