GPrank: an R package for detecting dynamic elements from genome-wide time series

Hande Topa, Antti Honkela

doi:10.1186/s12859-018-2370-4

Abstract

Background

Genome-wide high-throughput sequencing (HTS) time series experiments are a powerful tool for monitoring various genomic elements over time. They can be used to monitor, for example, gene or transcript expression with RNA sequencing (RNA-seq), DNA methylation levels with bisulfite sequencing (BS-seq), or abundances of genetic variants in populations with pooled sequencing (Pool-seq). However, because of high experimental costs, the time series data sets often consist of a very limited number of time points with very few or no biological replicates, which poses challenges for data analysis.

Results

Here we present the GPrank R package for modelling genome-wide time series by incorporating variance information obtained during pre-processing of the HTS data, either from probabilistic quantification methods or from a beta-binomial model that uses sequencing depth. GPrank is well suited to analysing both short and irregularly sampled time series. It models each time series with two Gaussian process (GP) models, one time-dependent and one time-independent, and compares the evidence the data provide for the two models by computing their Bayes factor (BF). Genomic elements are then ranked by their BFs, so that the temporally most dynamic elements can be identified.

Conclusions

Incorporating the variance information helps GPrank avoid false positives without compromising computational efficiency. Fitted models can easily be explored further in a browser. Detecting and visualising the temporally most dynamic elements in the genome can provide a good starting point for downstream analyses that increase our understanding of the studied processes.
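The model comparison described in the abstract can be illustrated with a minimal sketch. It is not GPrank's implementation or API: every function name below is hypothetical, the per-point variances are simply added to the diagonal of both covariance matrices, a coarse grid search stands in for proper hyperparameter optimisation, and the Bayes factor is approximated by the ratio of the maximised marginal likelihoods (reported on the log scale).

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def log_marginal(y, cov):
    """Log density of y under a zero-mean Gaussian with covariance cov."""
    n = len(y)
    low = cholesky(cov)
    z = []  # forward substitution: solve low * z = y
    for i in range(n):
        z.append((y[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (quad + logdet + n * math.log(2.0 * math.pi))

def covariance(t, fixed_var, signal_var, length, noise_var):
    """Squared-exponential kernel plus fixed per-point and free white noise."""
    n = len(t)
    cov = [[signal_var * math.exp(-0.5 * ((t[i] - t[j]) / length) ** 2)
            for j in range(n)] for i in range(n)]
    for i in range(n):
        cov[i][i] += fixed_var[i] + noise_var
    return cov

def log_bayes_factor(t, y, fixed_var):
    """Approximate log BF of the time-dependent over the time-independent model."""
    mean = sum(y) / len(y)
    yc = [v - mean for v in y]  # centre the series so a zero-mean GP is reasonable
    span = max(t) - min(t)
    scale = sum(v * v for v in yc) / len(yc) + 1e-12
    signals = [scale * f for f in (0.25, 0.5, 1.0, 2.0)]
    noises = [scale * f for f in (0.001, 0.01, 0.1, 0.5, 1.0)]
    lengths = [span * f for f in (0.25, 0.5, 1.0, 2.0)]
    dependent = max(log_marginal(yc, covariance(t, fixed_var, s, l, n))
                    for s in signals for l in lengths for n in noises)
    # Time-independent model: no signal kernel, only the noise terms remain.
    independent = max(log_marginal(yc, covariance(t, fixed_var, 0.0, 1.0, n))
                      for n in noises)
    return dependent - independent

def rank_by_log_bf(series):
    """series maps a name to (times, values, variances); most dynamic first."""
    scores = {name: log_bayes_factor(*s) for name, s in series.items()}
    return sorted(scores, key=scores.get, reverse=True), scores

if __name__ == "__main__":
    times = [0, 1, 2, 3, 4, 5]
    data = {
        "trend": (times, [0.10, 0.20, 0.35, 0.50, 0.70, 0.85], [0.001] * 6),
        "flat": (times, [0.50, 0.52, 0.48, 0.51, 0.49, 0.50], [0.01] * 6),
    }
    order, scores = rank_by_log_bf(data)
    for name in order:
        print(name, round(scores[name], 2))
```

In this toy data set the smoothly increasing series is ranked above the flat one. The flat series carries a large per-point variance, so its small fluctuations are explained by noise alone and the time-dependent model gains no support.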

Highlights

Genome-wide high-throughput sequencing (HTS) time series experiments are a powerful tool for monitoring various genomic elements over time.
Although GPrank includes examples of mean and variance modelling for RNA sequencing (RNA-seq) and Pool-seq data, it is flexible enough to be used with any kind of HTS data: users can first estimate the mean and variance with whichever method best suits the characteristics of their data and their expertise.
The method currently models the time series of each genomic element independently of the time series of the other genomic elements in the data set, which might lead to information loss.
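As one illustration of the kind of mean and variance estimate a user could supply, the sketch below derives both from sequencing depth with a beta-binomial model for Pool-seq allele counts. It assumes a uniform Beta(1, 1) prior by default and a hypothetical function name. It is a textbook conjugate update, not necessarily the estimator used in GPrank.

```python
def beta_posterior_mean_var(alt_count, depth, alpha=1.0, beta=1.0):
    """Posterior mean and variance of an allele frequency.

    alt_count reads out of depth carry the allele; the frequency has a
    Beta(alpha, beta) prior (uniform by default, an assumption of this sketch).
    """
    if depth < 0 or not 0 <= alt_count <= depth:
        raise ValueError("need 0 <= alt_count <= depth")
    a = alpha + alt_count          # posterior Beta parameters
    b = beta + depth - alt_count
    total = a + b
    mean = a / total
    var = a * b / (total ** 2 * (total + 1.0))
    return mean, var

if __name__ == "__main__":
    # Same observed frequency, different depth: deeper coverage, smaller variance.
    for alt, depth in [(5, 10), (50, 100), (500, 1000)]:
        mean, var = beta_posterior_mean_var(alt, depth)
        print(depth, mean, var)
```

The three calls give the same posterior mean but a variance that shrinks as depth grows. This is the information a variance-aware model can use to down-weight poorly covered time points.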



Journal: BMC Bioinformatics
Publication Date: Oct 4, 2018
Citations: 7
License type: open-access


