Signal, bias, and the role of transcriptome assembly quality in phylogenomic inference

Jennifer L Spillane, Matthew D Macmanes, David C Plachetzki, Troy M Lapolice

doi:10.1186/s12862-021-01772-2

Open Access

https://doi.org/10.1186/s12862-021-01772-2

Journal: BMC Ecology and Evolution	Publication Date: Mar 16, 2021
Citations: 8	License type: open-access

Affiliation: University of New Hampshire

Abstract

Background

Phylogenomic approaches have great power to reconstruct evolutionary histories; however, they rely on multi-step processes in which each stage has the potential to affect the accuracy of the final result. Many studies have empirically tested and established methodology for resolving robust phylogenies, including selecting appropriate evolutionary models, identifying orthologs, or isolating partitions with strong phylogenetic signal. However, few have investigated errors that may be initiated at earlier stages of the analysis. Biases introduced during the generation of the phylogenomic dataset itself could produce downstream effects on analyses of evolutionary history. Transcriptomes are widely used in phylogenomics studies, though there is little understanding of how a poor-quality assembly of these datasets could impact the accuracy of phylogenomic hypotheses. Here we examined how transcriptome assembly quality affects phylogenomic inferences by creating independent datasets from the same input data representing high-quality and low-quality transcriptome assembly outcomes.

Results

By studying the performance of phylogenomic datasets derived from alternative high- and low-quality assembly inputs in a controlled experiment, we show that high-quality transcriptomes produce richer phylogenomic datasets with a greater number of unique partitions than low-quality assemblies. High-quality assemblies also give rise to partitions that have lower alignment ambiguity and less compositional bias. In addition, high-quality partitions hold stronger phylogenetic signal than their low-quality transcriptome assembly counterparts in both concatenation- and coalescent-based analyses.

Conclusions

Our findings demonstrate the importance of transcriptome assembly quality in phylogenomic analyses and suggest that a portion of the uncertainty observed in such studies could be alleviated at the assembly stage.

Highlights

Phylogenomic approaches have great power to reconstruct evolutionary histories, but they rely on multi-step processes in which each stage has the potential to affect the accuracy of the final result
We find that high-quality transcriptomes produce larger phylogenomic datasets with partitions that have less alignment ambiguity, weaker compositional bias, and are more concordant with the constraint tree, in both concatenation- and coalescent-based analyses, than datasets derived from low-quality transcriptome assemblies
We prepared one high-quality dataset and one low-quality dataset from the same read sets using the Oyster River Protocol (ORP) [32], an assembly pipeline that creates five different transcriptome assemblies for each raw RNA-seq dataset, calculates quality scores for each one, and produces a merged transcriptome assembly consisting of the highest quality unique transcripts (Fig. 1)
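The merging idea described in the last highlight (several assemblies of the same reads, a quality score per transcript, and a merged set that keeps the best-scoring unique transcripts) can be illustrated with a toy sketch. This is not the ORP implementation: the `Transcript` fields, the `cluster` identifiers that mark transcripts as representing the same gene, and the scores are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Transcript:
    assembler: str  # hypothetical label for the assembly this transcript came from
    cluster: str    # hypothetical ID shared by transcripts that represent the same gene
    score: float    # hypothetical per-transcript quality score (higher is better)
    sequence: str


def merge_best(assemblies: Iterable[Iterable[Transcript]]) -> List[Transcript]:
    """Keep the single highest-scoring transcript from each cluster."""
    best: Dict[str, Transcript] = {}
    for transcripts in assemblies:
        for t in transcripts:
            current = best.get(t.cluster)
            # Replace the stored transcript only when the new one scores strictly higher.
            if current is None or t.score > current.score:
                best[t.cluster] = t
    # Return in cluster-ID order so the result is deterministic.
    return [best[c] for c in sorted(best)]


# Two made-up assemblies of the same read set.
assemblies = [
    [Transcript("asm_A", "g1", 0.40, "ATGAAA"),
     Transcript("asm_A", "g2", 0.90, "ATGCCC")],
    [Transcript("asm_B", "g1", 0.75, "ATGAAATTT"),
     Transcript("asm_B", "g3", 0.60, "ATGGGG")],
]

merged = merge_best(assemblies)
for t in merged:
    print(t.cluster, t.assembler, t.score)
```

In this sketch a gene recovered by only one assembler (`g2`, `g3`) still reaches the merged set, which is one plausible reason a merged assembly could yield more unique partitions than any single assembly.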

Summary

Introduction

Phylogenomic approaches have great power to reconstruct evolutionary histories, but they rely on multi-step processes in which each stage has the potential to affect the accuracy of the final result. The generation of a phylogenomic data matrix is a complex and critical process, as biases introduced at this point can propagate in downstream analyses in unpredictable ways. Researchers have added the step of recoding the amino acid data matrix in an attempt to account for saturation and compositional heterogeneity ([16, 22, 23, 24]; see [25]). While each of these issues is critical to consider in phylogenomic studies, collectively they deal with aspects of the analyses that occur after transcriptome datasets have been assembled. Biases introduced during the generation of the primary transcriptome assemblies are not explicitly addressed and may continue to influence downstream inferences.

Similar Papers

Bridging gaps in demographic analysis with phylogenetic imputation.
Tamora D James ... Dylan Z Childs
Conservation Biology | VOL. 35
21 Jan 2021

Phylogenomics of Lophotrochozoa with Consideration of Systematic Error.
Kevin M Kocot ... Damien S Waits
Systematic Biology | VOL. 66
23 Sep 2016

Selecting Question-Specific Genes to Reduce Incongruence in Phylogenomics: A Case Study of Jawed Vertebrate Backbone Phylogeny.
Meng-Yun Chen ... Dan Liang
Systematic Biology | VOL. 64
13 Aug 2015

Transcriptome sequencing reveals signatures of positive selection in the Spot-Tailed Earless Lizard.
Jose A Maldonado ... Matthew K Fujita
PloS one | VOL. 15
15 Jun 2020
