LSTrAP-denovo: Automated Generation of Transcriptome Atlases for Eukaryotic Species Without Genomes.

Peng Ken Lim, Ruoxi Wang, Marek Mutwil

doi:10.1111/ppl.14407

Abstract

Despite the abundance of species with transcriptomic data, a significant number of species still lack sequenced genomes, making it difficult to study gene function and expression in these organisms. While de novo transcriptome assembly can be used to assemble protein-coding transcripts from RNA-sequencing (RNA-seq) data, the datasets used often only feature samples of arbitrarily selected or similar experimental conditions, which might fail to capture condition-specific transcripts. We developed the Large-Scale Transcriptome Assembly Pipeline for de novo assembled transcripts (LSTrAP-denovo) to automatically generate transcriptome atlases of eukaryotic species. Specifically, given an NCBI TaxID, LSTrAP-denovo can (1) filter undesirable RNA-seq accessions based on read data, (2) select RNA-seq accessions via unsupervised machine learning to construct a sample-balanced dataset for download, (3) assemble transcripts via over-assembly, (4) functionally annotate coding sequences (CDS) from assembled transcripts and (5) generate transcriptome atlases in the form of expression matrices for downstream transcriptomic analyses. LSTrAP-denovo is easy to implement, written in Python, and is freely available at https://github.com/pengkenlim/LSTrAP-denovo/.
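Step (2) of the abstract, selecting accessions by unsupervised learning to obtain a sample-balanced dataset, can be illustrated with a minimal sketch. The abstract does not say which clustering method or features the pipeline uses, so everything here is an assumption for illustration: plain k-means with farthest-first seeding, two-dimensional feature vectors standing in for per-accession expression summaries, made-up accession IDs, and the helper names `kmeans_labels` and `select_balanced`. It is not the LSTrAP-denovo implementation.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iterations=20):
    """Cluster points with plain k-means (assumed method, not from the paper)."""
    # Farthest-first seeding: start from the first point, then repeatedly
    # add the point that is farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def select_balanced(accessions, labels, per_cluster):
    """Keep at most `per_cluster` accessions from every cluster."""
    clusters = defaultdict(list)
    for accession, label in zip(accessions, labels):
        clusters[label].append(accession)
    selected = []
    for label in sorted(clusters):
        selected.extend(clusters[label][:per_cluster])
    return selected


# Hypothetical accessions: six samples from one over-represented condition
# ("A") and two from a rarer condition ("B").
accessions = ["SRR_A1", "SRR_A2", "SRR_A3", "SRR_A4", "SRR_A5", "SRR_A6",
              "SRR_B1", "SRR_B2"]
features = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5],
            [0.2, 0.8], [10.0, 10.0], [10.0, 11.0]]

labels = kmeans_labels(features, k=2)
selected = select_balanced(accessions, labels, per_cluster=2)
print(selected)
```

Capping the number of accessions taken from each cluster keeps the over-represented condition from dominating the download list, which is the sense in which the resulting dataset is balanced.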

More From: Physiologia plantarum

Similar Papers

Performance evaluation of lossy quality compression algorithms for RNA-seq data
Rongshan Yu ... Wenxian Yang
BMC Bioinformatics | VOL. 21
20 Jul 2020

Dynamic epi-transcriptomic landscape mapping with disease progression in estrogen receptor-positive breast cancer.
Stephen Keelan ... Ben Doherty
Cancer Communications | VOL. 43
20 Jan 2023

Transcriptome of Xenopus andrei, an octoploid frog, during embryonic development
Mark E Pownall ... Margaret S Saha
Data in Brief | VOL. 19
10 May 2018

The importance of genomic predictors for clinical outcome of hematological malignancies
Cunte Chen ... Chengwu Zeng
Blood Science | VOL. 3
01 Jul 2021
