FunPat: function-based pattern analysis on RNA-seq time series data.

Tiziana Sanavia, Barbara Di Camillo, Francesca Finotello

doi:10.1186/1471-2164-16-s6-s2

Abstract

Background: Dynamic expression data, nowadays obtained using high-throughput RNA sequencing, are essential to monitor transient gene expression changes and to study the dynamics of their transcriptional activity in the cell or in response to stimuli. Several methods for data selection, clustering and functional analysis are available; however, these steps are usually performed independently, without exploiting and integrating the information derived from each step of the analysis.

Methods: Here we present FunPat, an R package for time series RNA sequencing data that integrates gene selection, clustering and functional annotation into a single framework. FunPat exploits functional annotations by performing, for each functional term (e.g. a Gene Ontology term), an integrated selection-clustering analysis to select differentially expressed genes that share, besides annotation, a common dynamic expression profile.

Results: FunPat performance was assessed on both simulated and real data. With respect to a stand-alone selection step, the integration of the clustering step is able to improve the recall without altering the false discovery rate. FunPat also shows high precision and recall in detecting the correct temporal expression patterns; in particular, the recall is significantly higher than that of hierarchical clustering, k-means and a model-based clustering approach specifically designed for RNA sequencing data. Moreover, when biological replicates are missing, FunPat is able to provide reproducible lists of significant genes. The application to real time series expression data shows the ability of FunPat to select differentially expressed genes with high reproducibility, indirectly confirming high precision and recall in gene selection. Moreover, the expression patterns obtained as output allow an easy interpretation of the results.

Conclusions: A novel analysis pipeline was developed to search for the main temporal patterns in classes of similarly annotated genes, improving the sensitivity of gene selection by integrating the statistical evidence of differential expression with the information on temporal profiles and the functional annotations. Significant genes are associated with both the most informative functional terms, avoiding redundancy of information, and the most representative temporal patterns, thus improving the readability of the results. The FunPat package is provided in R/Bioconductor at: http://sysbiobig.dei.unipd.it/?q=node/79.
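The abstract evaluates gene selection in terms of precision, recall and false discovery rate. As a minimal sketch of how these quantities relate, assuming a simulated setting in which the truly differentially expressed genes are known, the Python function below computes them from two sets of gene identifiers. The function name `selection_metrics` and the gene labels are hypothetical and are not part of FunPat.

```python
def selection_metrics(selected, truly_de):
    """Return (precision, recall, fdr) for a list of selected genes.

    selected: genes called differentially expressed by some method.
    truly_de: genes known to be differentially expressed (e.g. in a simulation).
    """
    selected, truly_de = set(selected), set(truly_de)
    tp = len(selected & truly_de)   # correctly selected genes
    fp = len(selected - truly_de)   # selected, but not truly DE
    fn = len(truly_de - selected)   # truly DE, but missed

    # Convention assumed here: with no selections the FDR is 0,
    # and with no truly DE genes the recall is reported as 0.
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    precision = 1.0 - fdr
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, fdr


# Hypothetical example: 4 genes selected, 5 genes truly DE, 3 in common.
precision, recall, fdr = selection_metrics(
    ["g1", "g2", "g3", "g4"],
    ["g1", "g2", "g3", "g5", "g6"],
)
print(precision, recall, fdr)
```

Under these definitions, improving the recall without altering the false discovery rate means recovering more of the truly differentially expressed genes while the fraction of false positives among the selected genes stays the same.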

Highlights

Dynamic expression data, nowadays obtained using high-throughput RNA sequencing, are essential to monitor transient gene expression changes and to study the dynamics of their transcriptional activity in the cell or in response to stimuli.
FunPat pipeline: FunPat takes as input the expression data and the functional annotations organized according to Gene Sets, which can be defined as Gene Ontology (GO) terms, pathways or other sets depending on the annotation database used as input.
We considered the stand-alone application of the Bounded Area method to evaluate whether the integration of gene selection with the clustering step and the functional annotation is able to improve the recall without loss in precision.

Summary

Introduction

Dynamic expression data, nowadays obtained using high-throughput RNA sequencing (RNA-seq), are essential to monitor transient gene expression changes and to study the dynamics of their transcriptional activity in the cell or in response to stimuli. Gene expression regulation is an intrinsically dynamic phenomenon, whose characteristics can be investigated using dynamic expression data. Compared with the microarray technique, RNA-seq avoids the design of specific probes, enabling a higher number of transcripts to be measured over a wider dynamic range. Among others, the Trimmed Mean of M-values (TMM) [6] provides scaling factors to correct the library sizes; each factor is calculated as a weighted mean of log ratios after filtering out the most expressed genes and the genes with the largest log ratios. This approach has recently been shown to prevent loss of statistical power in the analysis of RNA-seq data when high-count genes are present [7].
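The TMM description above can be made concrete with a simplified sketch in Python. It is not the reference implementation and is not taken from FunPat. The function name `tmm_scaling_factor`, the trimming fractions (30% of log ratios, 5% of average log expression) and the inverse-variance weights are assumptions based on the commonly described form of the method. The final rescaling across samples used by standard implementations is omitted.

```python
import math


def _ranks(values):
    """Rank positions (0 = smallest); ties are broken by original order."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position
    return ranks


def tmm_scaling_factor(sample, reference, m_trim=0.30, a_trim=0.05):
    """Simplified TMM factor of `sample` relative to `reference` (raw counts)."""
    n_s, n_r = sum(sample), sum(reference)
    m_vals, a_vals, weights = [], [], []
    for y_s, y_r in zip(sample, reference):
        if y_s == 0 or y_r == 0:
            continue  # log ratio undefined for zero counts
        variance = (n_s - y_s) / (n_s * y_s) + (n_r - y_r) / (n_r * y_r)
        if variance <= 0:
            continue  # degenerate case: a single gene holds all the counts
        p_s, p_r = y_s / n_s, y_r / n_r
        m_vals.append(math.log2(p_s / p_r))        # M: log ratio
        a_vals.append(0.5 * math.log2(p_s * p_r))  # A: average log expression
        weights.append(1.0 / variance)             # assumed inverse-variance weight

    n = len(m_vals)
    if n == 0:
        return 1.0
    rank_m, rank_a = _ranks(m_vals), _ranks(a_vals)
    cut_m, cut_a = int(n * m_trim), int(n * a_trim)
    # Keep genes that are extreme neither in log ratio nor in expression level.
    kept = [i for i in range(n)
            if cut_m <= rank_m[i] < n - cut_m and cut_a <= rank_a[i] < n - cut_a]
    if not kept:
        return 1.0
    log_factor = (sum(weights[i] * m_vals[i] for i in kept)
                  / sum(weights[i] for i in kept))
    return 2.0 ** log_factor


# Hypothetical counts: 20 unchanged genes plus one very high-count gene.
reference = [100] * 21
sample = [100] * 20 + [10000]
factor = tmm_scaling_factor(sample, reference)
print(factor, sum(sample) * factor)  # factor and effective library size
```

In this toy example the single high-count gene inflates the library size of `sample`, while the other genes are unchanged. The high-count gene is trimmed away, so the factor shrinks the effective library size of `sample` back to that of `reference`, and the unchanged genes are no longer made to look down-regulated.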

Journal: BMC Genomics	Publication Date: Jun 1, 2015
Citations: 83	License type: cc-by
