Discovering Pair-wise Synergies in Microarray Data.

Yuan Chen, Jun Gao, Zheming Yuan, Dan Cao, Jun Gao

doi:10.1038/srep30672

Abstract

Informative gene selection can have important implications for improving cancer diagnosis and identifying new drug targets. Individual-gene-ranking methods ignore interactions between genes. Furthermore, popular pair-wise gene evaluation methods, e.g. TSP and TSG, cannot discover pair-wise interactions. Several efforts to discover pair-wise synergy have been made based on information-theoretic approaches, such as EMBP and FeatKNN. However, the methods employed to estimate mutual information, e.g. binarization, histogram-based and KNN estimators, depend on known data or domain characteristics. Recently, Reshef et al. proposed a novel maximal information coefficient (MIC) measure that captures a wide range of associations between two variables and has the property of generality. An extension from MIC(X; Y) to MIC(X1; X2; Y) is therefore desirable. We developed an approximation algorithm for estimating MIC(X1; X2; Y), where Y is a discrete variable. MIC(X1; X2; Y) is employed to detect pair-wise synergy in simulated and cancer microarray data. The results indicate that MIC(X1; X2; Y) also has the property of generality. It can discover synergic genes that are undetectable by reference feature selection methods such as MIC(X; Y) and TSG. Synergic genes can distinguish different phenotypes. Finally, the biological relevance of these synergic genes is validated with GO annotation and the OUgene database.
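
To make the notion of pair-wise synergy concrete, the following minimal sketch uses a toy XOR example in which neither of two binarized "genes" is informative on its own, yet the pair determines the class label. It is an illustration only: it uses a plug-in mutual information estimate on already-discrete variables rather than the MIC(X1; X2; Y) approximation algorithm described in the abstract, and the synergy score I(X1, X2; Y) - I(X1; Y) - I(X2; Y), the function names and the data are assumptions introduced for this example.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def pair_synergy(x1, x2, y):
    """Hypothetical synergy score: I(X1, X2; Y) - I(X1; Y) - I(X2; Y)."""
    joint_info = mutual_information(list(zip(x1, x2)), y)
    return joint_info - mutual_information(x1, y) - mutual_information(x2, y)


# Toy data (made up): the class label is the XOR of two binarized "genes".
x1 = [0, 0, 1, 1] * 25
x2 = [0, 1, 0, 1] * 25
y = [a ^ b for a, b in zip(x1, x2)]

print(mutual_information(x1, y))  # information carried by gene 1 alone
print(mutual_information(x2, y))  # information carried by gene 2 alone
print(pair_synergy(x1, x2, y))    # extra information carried by the pair
```

In this toy case an individual-gene ranking would score both genes at zero, which is the situation in which a pair-wise measure becomes necessary.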

Highlights

Informative gene selection can have important implications for the improvement of cancer diagnosis and the identification of new drug targets
Pair-wise gene evaluation has been implemented in several popular algorithms, including top scoring pair (TSP)[8,9], top scoring genes (TSG)[2], and doublets[7], which all compare expression values of the same sample between two different genes
Let X and Y be two independent random variables, with Y binarized at its median; then MIC(X; Y) = 0.1702 ± 0.0292
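
The last highlight reflects a general property of grid-based estimators: with a finite sample, the score for independent variables is positive rather than exactly zero. The sketch below illustrates this with a deliberately simplified MIC-like score that searches only equal-frequency partitions of X against a binary label. It is not the estimator used in the paper and is not expected to reproduce the reported value; the sample size, the number of repetitions, the function names and the restriction to equal-frequency bins are all assumptions made for illustration. The grid budget n^0.6 follows the default suggested by Reshef et al.

```python
import random
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins bins of (nearly) equal size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def grid_mic_binary(x, y_binary, alpha=0.6):
    """Simplified MIC-like score for continuous x and a binary label.

    Only equal-frequency partitions of x are tried, which is a
    simplification of the partition search used by MIC proper.
    """
    max_cells = len(x) ** alpha         # grid-size budget n^alpha
    best = 0.0
    n_bins = 2
    while n_bins * 2 <= max_cells:      # the label axis always has 2 bins
        mi = mutual_information(equal_frequency_bins(x, n_bins), y_binary)
        best = max(best, mi / log2(2))  # normalise by log2(min(n_bins, 2))
        n_bins += 1
    return best


random.seed(0)
n = 100  # hypothetical sample size, not taken from the paper
scores = []
for _ in range(50):
    x = [random.random() for _ in range(n)]
    y = [random.random() for _ in range(n)]
    median = sorted(y)[n // 2]
    y_binary = [int(v >= median) for v in y]  # binarize Y at its median
    scores.append(grid_mic_binary(x, y_binary))

mean_score = sum(scores) / len(scores)
print(round(mean_score, 4))  # positive even though X and Y are independent
```

Because the score is maximised over several grids, the finite-sample bias grows with the grid budget, so any threshold for "no association" has to be calibrated against a null simulation of this kind.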

Summary

Results

Generality of MIC(X1; X2; Y) according to simulation analysis. If X1 and X2 are statistically independent of Y, MIC(X1; X2; Y) should be close to 0. Each reference method ranks the top 200 genes (Top200s) for each dataset (the Top200s are shown in Supplementary Material Tables S1-S3). We can observe significant overlaps between the Top200s selected by the four reference methods, as shown in Figs 6, 7 and 8. This indicates that a considerable number of similar informative genes can be detected by these reference methods. Although MRMR, SVM-RFE and TSG are not individual-gene-filter methods, the Top200s selected by them are considerably similar to the Top200s selected by MIC(X; Y). This indicates that these methods can efficiently discover genes that are individually discriminant, but they are not specific to genes that have pair-wise synergy effects.
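
The overlap comparison between top-ranked gene lists can be sketched as a set intersection over each pair of methods. The example below is a toy illustration under stated assumptions: the gene identifiers, scores and method labels are made up, and k = 3 is used in place of the k = 200 used in the text.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring gene identifiers."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def overlap_table(top_sets):
    """Count the shared genes for every pair of methods."""
    names = sorted(top_sets)
    return {
        (a, b): len(top_sets[a] & top_sets[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Made-up scores for six hypothetical genes under three hypothetical methods.
scores_by_method = {
    "individual": {"g1": 0.9, "g2": 0.8, "g3": 0.7, "g4": 0.2, "g5": 0.1, "g6": 0.0},
    "tsg": {"g1": 0.8, "g2": 0.9, "g3": 0.3, "g4": 0.7, "g5": 0.2, "g6": 0.1},
    "pairwise": {"g1": 0.7, "g2": 0.1, "g3": 0.2, "g4": 0.3, "g5": 0.9, "g6": 0.8},
}

top_sets = {name: top_k(s, 3) for name, s in scores_by_method.items()}
overlaps = overlap_table(top_sets)
for pair, shared in overlaps.items():
    print(pair, shared)
```

A large overlap between two methods suggests they reward the same individually discriminant genes, whereas a small overlap with a pair-wise measure suggests the latter is picking up genes of a different kind.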

Dataset Prostate Lung DLBCL

Discussion

Adrenal adenoma

Validation accuracy Validation MCC

Author Contributions

Additional Information

Journal: Scientific reports	Publication Date: Jul 29, 2016
Citations: 5	License type: CC BY 4.0
