Abstract

Background

Salmonid fishes exhibit high levels of phenotypic and ecological variation and are thus ideal model systems for studying evolutionary processes of adaptive divergence and speciation. Furthermore, salmonids are of major interest in fisheries, aquaculture, and conservation research. Improving understanding of the genetic mechanisms underlying traits in these species would significantly progress research in these fields. Here we generate high quality de novo transcriptomes for four salmonid species: Atlantic salmon (Salmo salar), brown trout (Salmo trutta), Arctic charr (Salvelinus alpinus), and European whitefish (Coregonus lavaretus). All species except Atlantic salmon have no reference genome publicly available and few if any genomic studies to date.

Results

We used paired-end RNA-seq on Illumina to generate high coverage sequencing of multiple individuals, yielding between 180 and 210 M reads per species. After initial assembly, strict filtering was used to remove duplicated, redundant, and low confidence transcripts. The final assemblies consisted of 36,505 protein-coding transcripts for Atlantic salmon, 35,736 for brown trout, 33,126 for Arctic charr, and 33,697 for European whitefish and are made publicly available. Assembly completeness was assessed using three approaches, all of which supported high quality of the assemblies: 1) ~78% of Actinopterygian single-copy orthologs were successfully captured in our assemblies, 2) orthogroup inference identified high overlap in the protein sequences present across all four species (40% shared across all four and 84% shared by at least two), and 3) comparison with the published Atlantic salmon genome suggests that our assemblies represent well covered (~98%) protein-coding transcriptomes. Thorough comparison of the generated assemblies found that 84-90% of transcripts in each assembly were orthologous with at least one of the other three species. We also identified 34-37% of transcripts in each assembly as paralogs. We further compare completeness and annotation statistics of our new assemblies to available related species.

Conclusion

New, high-confidence protein-coding transcriptomes were generated for four ecologically and economically important species of salmonids. This offers a high quality pipeline for such complex genomes, represents a valuable contribution to the existing genomic resources for these species and provides robust tools for future investigation of gene expression and sequence evolution in these and other salmonid species.

Highlights

  • Salmonid fishes exhibit high levels of phenotypic and ecological variation and are ideal model systems for studying evolutionary processes of adaptive divergence and speciation

  • Atlantic salmon were collected from an anadromous river-running population on the river Blackwater, brown trout were third-generation hatchery trout from Howietoun Hatchery (Stirling, Scotland), Arctic charr were wild caught from a generalist freshwater population in Loch Clair (North-west Scotland), and European whitefish were wild caught from the generalist freshwater population at Loch Lomond

  • To further assess the completeness and utility of the resources presented here, we examined how successfully Benchmarking set of Universal Single-Copy Orthologs (BUSCO) were recovered in our assemblies compared to the National Center for Biotechnology Information (NCBI) protein dataset for Atlantic salmon (GCF_000233375.1) (48,602 transcripts; based on retaining only the longest isoform per gene), as well as against the PhyloFish brown trout and European whitefish assemblies (75,388 and 74,701 transcripts respectively) [35]
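
The "longest isoform per gene" reduction mentioned in the last highlight can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout `(gene_id, transcript_id, sequence)`, the function name `keep_longest_isoform`, and the toy sequences are assumptions made purely for illustration, and ties are broken by keeping the first isoform seen.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (gene_id, transcript_id, protein_sequence).
Record = Tuple[str, str, str]


def keep_longest_isoform(records: Iterable[Record]) -> Dict[str, Tuple[str, str]]:
    """Return, for each gene, the (transcript_id, sequence) of its longest isoform.

    Ties are resolved in favour of the isoform encountered first.
    """
    longest: Dict[str, Tuple[str, str]] = {}
    for gene_id, transcript_id, seq in records:
        current = longest.get(gene_id)
        # Replace only when the new isoform is strictly longer.
        if current is None or len(seq) > len(current[1]):
            longest[gene_id] = (transcript_id, seq)
    return longest


# Toy input (invented sequences, not real salmonid data).
example = [
    ("geneA", "geneA.t1", "MKT"),
    ("geneA", "geneA.t2", "MKTLLV"),
    ("geneB", "geneB.t1", "MSS"),
]
filtered = keep_longest_isoform(example)
```

Collapsing each gene to a single representative in this way keeps transcript counts comparable between datasets that differ in how many isoforms they report per gene.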

Summary

Introduction

Salmonid fishes exhibit high levels of phenotypic and ecological variation and are ideal model systems for studying evolutionary processes of adaptive divergence and speciation. Salmonids exhibit exceedingly high levels of diversity in their life histories, behaviour, morphology and physiology, with patterns of trait variation often replicated within and across species, as well as across different freshwater systems [3,4,5,6,7]. This makes salmonids interesting in the context of fundamental and applied research on intra- and inter-specific diversity in morphology, physiology and ecology. Several important resources have been established through the efforts of consortia such as cGRASP (Consortium for Genomic Research on All Salmonids Program, http://www.sfu.ca/cgrasp/index.html), ICSASG (International Collaboration to Sequence the Atlantic Salmon Genome), and SalmonDB (http://salmondb.cmm.uchile.cl). These include expressed sequence tag (EST) databases, microarray gene expression platforms, and SNP arrays. Consortia efforts have generated extensive EST databases for Atlantic salmon (Salmo salar) and rainbow trout (Oncorhynchus mykiss) [13,14,15,16,17,18], as well as on a smaller scale for other salmonid species such as chinook salmon (Oncorhynchus tshawytscha), sockeye salmon (Oncorhynchus nerka) and lake whitefish (Coregonus clupeaformis) [13]. cGRASP have generated dense microarray (44 K oligo array) and SNP-chip (~130 K) platforms for Atlantic salmon [19,20,21,22,23].
