RankProd 2.0: a refactored bioconductor package for detecting differentially expressed features in molecular profiling datasets.

Francesco Del Carratore, Rainer Breitling, Fangxin Hong, Andris Jankevics, Rob Eisinga, Tom Heskes, Ziv Bar-Joseph

doi:10.1093/bioinformatics/btx292

Open Access

Abstract

Motivation: The Rank Product (RP) is a statistical technique widely used to detect differentially expressed features in molecular profiling experiments such as transcriptomics, metabolomics and proteomics studies. An implementation of the RP and the closely related Rank Sum (RS) statistics has been available in the RankProd Bioconductor package for several years. However, several recent advances in the understanding of the statistical foundations of the method have made a complete refactoring of the existing package desirable.

Results: We implemented a completely refactored version of the RankProd package, which provides a more principled implementation of the statistics for unpaired datasets. Moreover, the permutation-based P-value estimation methods have been replaced by exact methods, providing faster and more accurate results.

Availability and implementation: RankProd 2.0 is available at Bioconductor (https://www.bioconductor.org/packages/devel/bioc/html/RankProd.html) and as part of the mzMatch pipeline (http://www.mzmatch.sourceforge.net).

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

Finding differentially expressed molecular features when comparing different conditions plays a pivotal role in all kinds of molecular profiling studies (“omics”).
Because unpaired datasets are increasingly common, we developed a more principled approach, described in Section 4, that makes the application of Rank Product (RP) and Rank Sum (RS) to unpaired datasets more reliable.
We introduce a method for the exact calculation of the RS p-values. This is derived from the simple observation that under the null hypothesis, the probability distribution of the RS, in an experiment with N variables and K replicates, is exactly the same as the probability distribution of the sum of the outcomes obtained by rolling K dice with N faces
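
The dice analogy in the last highlight can be made concrete with a short sketch. The code below is an illustration only, not the RankProd 2.0 implementation: the function names `rank_sum_pmf` and `rank_sum_pvalue` are hypothetical, and it assumes that under the null hypothesis each of the K ranks is independent and uniform on 1..N, and that a small rank sum is the interesting tail. It builds the exact distribution by convolving one N-faced die K times.

```python
from fractions import Fraction


def rank_sum_pmf(n_features, k_replicates):
    """Exact null distribution of the rank sum as {sum: probability}.

    Assumption: each replicate contributes an independent rank that is
    uniform on 1..n_features, i.e. one roll of an n_features-sided die.
    """
    pmf = {0: Fraction(1)}
    face = Fraction(1, n_features)
    for _ in range(k_replicates):
        nxt = {}
        # Convolve the running distribution with one more die roll.
        for total, p in pmf.items():
            for r in range(1, n_features + 1):
                nxt[total + r] = nxt.get(total + r, Fraction(0)) + p * face
        pmf = nxt
    return pmf


def rank_sum_pvalue(observed, n_features, k_replicates):
    """Lower-tail p-value: P(rank sum <= observed) under the null."""
    pmf = rank_sum_pmf(n_features, k_replicates)
    return sum(p for s, p in pmf.items() if s <= observed)


if __name__ == "__main__":
    # Two ordinary six-sided dice: P(sum <= 3) = 3/36.
    print(rank_sum_pvalue(3, 6, 2))
```

With N = 6 and K = 2 this reproduces the familiar two-dice result: a sum of at most 3 has probability 3/36 = 1/12. Exact fractions are used so that very small tail probabilities are not lost to rounding; a production implementation would likely use a faster convolution.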

Summary

Introduction

Finding differentially expressed molecular features when comparing different conditions plays a pivotal role in all kinds of molecular profiling studies (“omics”). The main identified weakness of the RP method is its sensitivity to variable-specific measurement variance. This problem has been successfully addressed by a number of variance-stabilizing normalization techniques (Durbin et al., 2002; Huber et al., 2002; Breitling and Herzyk, 2005). Previously, p-values for both statistics were estimated by a permutation-based method (Hong et al., 2006). This method needs a computationally demanding number of permutations to give accurate results, and its estimates are unreliable in the tails of the distribution, i.e. for the most interesting molecular features. Because unpaired datasets are increasingly common, we developed a more principled approach, described in Section 4, that makes the application of RP and RS to unpaired datasets more reliable.
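
To illustrate why permutation-based estimates struggle in the tails, here is a minimal sketch of a rank product with a permutation p-value. It is an assumption-laden illustration, not the code of Hong et al. (2006) or of RankProd: the names `rank_products` and `permutation_pvalues` are hypothetical, the input is assumed to be one complete ranking of the N features per replicate, and small rank products are treated as the interesting tail.

```python
import random
from bisect import bisect_right
from math import prod


def rank_products(ranks):
    """Rank product per feature.

    ranks: list of K lists, each a permutation of 1..N giving the rank
    of every feature in that replicate (hypothetical input layout).
    """
    n = len(ranks[0])
    return [prod(rep[g] for rep in ranks) for g in range(n)]


def permutation_pvalues(ranks, n_perm=1000, seed=0):
    """Estimate P(RP <= observed) by shuffling ranks within replicates."""
    rng = random.Random(seed)
    n, k = len(ranks[0]), len(ranks)
    observed = rank_products(ranks)
    counts = [0] * n
    for _ in range(n_perm):
        # One null dataset: an independent random ranking per replicate.
        shuffled = [rng.sample(range(1, n + 1), n) for _ in range(k)]
        null_rp = sorted(rank_products(shuffled))
        for g, rp in enumerate(observed):
            # Number of null rank products that are <= the observed one.
            counts[g] += bisect_right(null_rp, rp)
    # Pool all n_perm * n null values to form the estimate.
    return [c / (n_perm * n) for c in counts]


if __name__ == "__main__":
    example = [[1, 2, 3, 4], [1, 3, 2, 4], [1, 2, 4, 3]]
    print(permutation_pvalues(example, n_perm=200, seed=1))
```

In this sketch every estimate is a multiple of 1/(n_perm × N), so a feature whose true p-value is far smaller than that step is reported as zero or as a coarse value unless the number of permutations grows accordingly. That is the cost that an exact calculation avoids.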

P-values estimation for the Rank Product

P-values estimation for the Rank Sum

Application to unpaired datasets

Conclusion

Journal: Bioinformatics	Publication Date: May 8, 2017
Citations: 111	License type: CC BY 4.0
