Aptardi predicts polyadenylation sites in sample-specific transcriptomes using high-throughput RNA sequencing and DNA sequence

Ryan Lusk, Boris Tabakoff, Katerina Kechris, Farnoush Banaei-Kashani, Laura M Saba, Evan Stene

doi:10.1038/s41467-021-21894-x

Abstract

Annotation of polyadenylation sites from short-read RNA sequencing alone is a challenging computational task. Other algorithms rooted in DNA sequence predict potential polyadenylation sites; however, in vivo expression of a particular site varies based on a myriad of conditions. Here, we introduce aptardi (alternative polyadenylation transcriptome analysis from RNA-Seq data and DNA sequence information), which leverages both DNA sequence and RNA sequencing in a machine learning paradigm to predict expressed polyadenylation sites. Specifically, as input aptardi takes DNA nucleotide sequence, genome-aligned RNA-Seq data, and an initial transcriptome. The program evaluates these initial transcripts to identify expressed polyadenylation sites in the biological sample and refines transcript 3′-ends accordingly. The average precision of the aptardi model is twice that of a standard transcriptome assembler. In particular, the recall of the aptardi model (the proportion of true polyadenylation sites detected by the algorithm) is improved by over three-fold. Also, the model—trained using the Human Brain Reference RNA commercial standard—performs well when applied to RNA-sequencing samples from different tissues and different mammalian species. Finally, aptardi’s input is simple to compile and its output is easily amenable to downstream analyses such as quantitation and differential expression.
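To make the recall definition above concrete, here is a minimal Python sketch of how precision and recall could be computed for a set of predicted polyadenylation sites against a set of reference sites. It is not aptardi's evaluation code: the function name `precision_recall`, the representation of sites as integer coordinates on a single strand, and the 100-base matching tolerance are all assumptions made for illustration. The abstract's "average precision" is a separate summary metric and is not computed here.

```python
from bisect import bisect_left


def precision_recall(predicted, truth, tolerance=100):
    """Return (precision, recall) for predicted vs. true site coordinates.

    Hypothetical helper for illustration only. A site counts as matched if a
    site from the other set lies within `tolerance` bases; the default of 100
    is an assumed value, not one taken from the paper.
    """
    truth_sorted = sorted(truth)
    pred_sorted = sorted(predicted)

    def has_neighbor(pos, sorted_sites):
        # Check the nearest candidates on either side of the insertion point.
        i = bisect_left(sorted_sites, pos)
        for j in (i - 1, i):
            if 0 <= j < len(sorted_sites) and abs(sorted_sites[j] - pos) <= tolerance:
                return True
        return False

    # Precision: fraction of predicted sites that are near a true site.
    matched_pred = sum(has_neighbor(p, truth_sorted) for p in predicted)
    # Recall: fraction of true sites that are near a predicted site.
    matched_true = sum(has_neighbor(t, pred_sorted) for t in truth)

    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_true / len(truth) if truth else 0.0
    return precision, recall


# Toy coordinates, invented for this example.
p, r = precision_recall([1000, 2050, 5000], [1010, 2000, 3000, 4000])
```

In this toy case two of three predictions fall near a true site and two of four true sites are recovered, so a tool can improve recall substantially by finding more of the true sites even when its precision changes little.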


Journal: Nature Communications
Publication Date: Mar 12, 2021
Citations: 22
License type: open-access

Similar Papers

Abstract 5791: Microfluidics-based library preparation strategies for targeted high-throughput sequencing and RNA sequencing in hematological malignancies in clinical samples
Cédric Pastoret ... Amyra Aliouat
Cancer Research | VOL. 82
15 Jun 2022

DNA and RNA next generation sequencing for personalizing cancer treatment: A single-center experience.
Joseba Rebollo ... Antonio Brugarolas
Journal of Clinical Oncology | VOL. 38
20 May 2020

MountainClimber Identifies Alternative Transcription Start and Polyadenylation Sites in RNA-Seq.
Ashley A Cass ... Xinshu Xiao
Cell Systems | VOL. 9
18 Sep 2019

Abstract 4679: Novel pathway mutations in malignant mesothelioma revealed by high-throughput DNA and RNA sequencing
Akihiko Miyanaga ... Hisao Asamura
Cancer Research | VOL. 74
30 Sep 2014
