Abstract

Few studies have investigated the causes of evolutionary rate variation among plant nuclear genes, especially in recently diverged species still capable of hybridizing in the wild. The recent advent of Next Generation Sequencing (NGS) permits investigation of genome-wide rates of protein evolution and the role of selection in generating and maintaining divergence. Here, we use individual whole-transcriptome sequencing (RNAseq) to refine our understanding of the population genomics of wild species of sunflowers (Helianthus spp.) and the factors that affect rates of protein evolution. We aligned 35 GB of transcriptome sequencing data and identified 433,257 polymorphic sites (SNPs) in a reference transcriptome comprising 16,312 genes. Using SNP markers, we identified strong population clustering largely corresponding to the three species analyzed here (Helianthus annuus, H. petiolaris, H. debilis), with one distinct early-generation hybrid. Then, we calculated the proportion of adaptive substitutions fixed by selection (alpha) and identified gene ontology categories with elevated values of alpha. The “response to biotic stimulus” category had the highest mean alpha across the three interspecific comparisons, implying that natural selection imposed by other organisms plays an important role in driving protein evolution in wild sunflowers. Finally, we examined the relationship between protein evolution (dN/dS ratio) and several genomic factors predicted to co-vary with protein evolution (gene expression level, divergence and specificity, genetic divergence [FST], and nucleotide diversity [pi]). We find that variation in rates of protein divergence was correlated with gene expression level and specificity, consistent with results from a broad range of taxa and timescales. This would in turn imply that these factors govern protein evolution at both microevolutionary and macroevolutionary timescales.
Our results contribute to a general understanding of the determinants of rates of protein evolution and the impact of selection on patterns of polymorphism and divergence.

Highlights

  • Achieving a better understanding of the factors that shape patterns of divergence across genes, a central aim in evolutionary genetics, should become increasingly straightforward as the amount of sequencing data available grows exponentially [1–3]

  • We generated a total of 122 GB (122 × 10^9 base pairs or 609 million paired-end reads) of sequencing data from RNA extracted from the 29 seedlings representing three Helianthus taxa: H. annuus, H. debilis, and H. petiolaris (Figure 1 and Tables 1 and 2, summary of sequencing statistics)

  • While likely to revolutionize our understanding of transcriptome evolution, especially in non-model systems, a possible caveat is that sources of bias in coverage and gene expression estimates are not yet fully understood

Introduction

Achieving a better understanding of the factors that shape patterns of divergence across genes, a central aim in evolutionary genetics, should become increasingly straightforward as the amount of sequencing data available grows exponentially [1–3]. A robust and simple way to quantify rates of gene evolution at the protein level comes from dN/dS ratio measurements [4]. This approach quantifies selection pressures by comparing the rate of substitutions at silent sites (dS), which are presumed neutral, to the rate of substitutions at non-silent sites (dN, amino acid changes), which may experience selection. A number of genomic parameters have been shown to correlate with rates of protein-coding evolution in model organisms, including gene expression level and specificity, gene length, recombination rate, and mutation rate. A general conclusion is that gene expression level, specificity, and essentiality account for a substantial amount of variation in rates of protein evolution due to the selective constraints imposed by these genomic parameters [1,5–8]. While early studies of a limited number of genes [9,10] did identify a positive correlation between the rate of protein evolution and expression divergence, an absence of such correlations is more commonly reported (e.g., [11,12]).
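
To make the dN/dS comparison concrete, the following minimal sketch computes the ratio for one gene from counts of synonymous and nonsynonymous differences and sites. It is an illustration only and not the pipeline used in this study: it assumes the counts have already been obtained from a pairwise codon alignment, it applies a simple Jukes-Cantor correction for multiple hits (in the spirit of counting methods such as Nei and Gojobori's), and the function and parameter names are hypothetical.

```python
import math


def jukes_cantor(p):
    """Convert a proportion of differing sites into substitutions per site.

    Applies the Jukes-Cantor correction for multiple substitutions at a site.
    """
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return dN/dS from difference and site counts (hypothetical inputs).

    nonsyn_diffs / nonsyn_sites: observed nonsynonymous differences and
        the number of nonsynonymous sites in the alignment.
    syn_diffs / syn_sites: the same quantities for synonymous (silent) sites.
    """
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        # No synonymous divergence: the ratio is undefined for this gene.
        return float("nan")
    return dn / ds


# Illustrative counts for a single gene (made-up numbers, not study data).
ratio = dn_ds(nonsyn_diffs=6, nonsyn_sites=300, syn_diffs=10, syn_sites=100)
print(f"dN/dS = {ratio:.3f}")
```

Under the usual interpretation, a ratio well below 1 indicates that amino acid changes accumulate more slowly than silent changes, as expected under purifying selection, while a ratio above 1 is taken as a signature of positive selection. Genes with no synonymous divergence return NaN here and would need to be filtered out before any downstream correlation analysis.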
