A unified framework for finding differentially expressed genes from microarray experiments.

Jahangheer S Shaik, Mohammed Yeasin

doi:10.1186/1471-2105-8-347

Abstract

Background: This paper presents a unified framework for finding differentially expressed genes (DEGs) from microarray data. The proposed framework has three interrelated modules: (i) gene ranking, (ii) significance analysis of genes and (iii) validation. The first module ranks the genes using two gene selection algorithms, namely (a) two-way clustering and (b) combined adaptive ranking. The second module converts the gene ranks into p-values using an R-test and fuses the two sets of p-values using Fisher's omnibus criterion. The DEGs are then selected using false discovery rate (FDR) analysis. The third module performs a three-fold validation of the obtained DEGs. The robustness of the proposed unified framework in gene selection is first illustrated using FDR analysis. In addition, clustering-based validation of the DEGs is performed by applying an adaptive subspace-based clustering algorithm to the training and test datasets. Finally, a projection-based visualization is performed to validate the DEGs obtained using the unified framework.

Results: The performance of the unified framework is compared with well-known ranking algorithms such as t-statistics, Significance Analysis of Microarrays (SAM), Adaptive Ranking, Combined Adaptive Ranking and Two-way Clustering. The performance curves obtained using 50 simulated microarray datasets, each following two different distributions, indicate the superiority of the unified framework over the other reported algorithms. Further analyses on 3 real cancer datasets and 3 Parkinson's datasets show a similar improvement in performance. First, a three-fold validation process is provided for the two-sample cancer datasets. In addition, the analysis on 3 sets of Parkinson's data is performed to demonstrate the scalability of the proposed method to multi-sample microarray datasets.

Conclusion: This paper presents a unified framework for the robust selection of genes from two-sample as well as multi-sample microarray experiments. The two different ranking methods used in module 1 bring diversity to the selection of genes. The conversion of ranks to p-values, the fusion of p-values and the FDR analysis aid in identifying significant genes, which cannot be judged from gene ranking alone. The three-fold validation, namely robustness in selection of genes using FDR analysis, clustering, and visualization, demonstrates the relevance of the DEGs. Empirical analyses on 50 artificial datasets and 6 real microarray datasets illustrate the efficacy of the proposed approach. The analyses on 3 cancer datasets demonstrate the utility of the proposed approach on microarray datasets with two classes of samples. The scalability of the proposed unified approach to multi-sample (more than two sample classes) microarray datasets is addressed using three sets of Parkinson's data. Empirical analyses show that the unified framework outperformed other gene selection methods in selecting differentially expressed genes from microarray data.
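The fusion and selection steps of the second module can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the R-test that turns ranks into p-values is not specified here, so the code starts from p-values that are already available; the input values are hypothetical; the FDR step is taken to be the Benjamini-Hochberg procedure, which the abstract does not confirm; and Fisher's criterion is applied as if the two p-values per gene were independent. The function names are illustrative only.

```python
import math


def fisher_combine(p1, p2):
    """Fuse two p-values (each in (0, 1]) with Fisher's omnibus criterion.

    Under the null, X = -2 * (ln p1 + ln p2) is chi-square distributed with
    4 degrees of freedom, whose survival function has the closed form
    exp(-X / 2) * (1 + X / 2).
    """
    x = -2.0 * (math.log(p1) + math.log(p2))
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the sorted indices of genes selected at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up procedure; the source only
    says "FDR analysis".
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its threshold.
        if pvalues[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical per-gene p-values from the two ranking methods.
p_two_way_clustering = [0.001, 0.20, 0.03, 0.60, 0.004]
p_combined_adaptive = [0.002, 0.35, 0.01, 0.80, 0.050]

fused = [
    fisher_combine(a, b)
    for a, b in zip(p_two_way_clustering, p_combined_adaptive)
]
selected = benjamini_hochberg(fused, alpha=0.05)
print(selected)
```

With these made-up inputs, genes 0, 2 and 4 pass the threshold. Gene 4 shows the point of the fusion: one of its two p-values sits at the 0.05 boundary, yet the combined evidence is strong enough for it to be selected.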

Highlights

This paper presents a unified framework for finding differentially expressed genes (DEGs) from the microarray data
Artificial datasets with ground truth information are used to compare the performance of the unified framework with other gene selection methods
The data is divided into a training class containing 38 samples (27 Acute Lymphoid Leukemia (ALL) and 11 Acute Myeloid Leukemia (AML)) and a test class containing 34 samples of tissues (20 ALL and 14 AML)

Summary

Introduction

This paper presents a unified framework for finding differentially expressed genes (DEGs) from microarray data. The proposed framework has three interrelated modules: (i) gene ranking, (ii) significance analysis of genes and (iii) validation. The third module performs a three-fold validation of the obtained DEGs. The robustness of the proposed unified framework in gene selection is first illustrated using false discovery rate analysis. Microarray experiments produce expression profiles measured under given experimental conditions and are normally labeled on the basis of external information, such as clinical identification of samples or expression of genes with respect to time [1]. Gene selection can be challenging because microarray data are skewed, with a large number of genes in one dimension and only a few samples in the other. There is a large volume of biological and technical noise that must be normalized to generate a more uniform measure


Journal: BMC Bioinformatics
Publication Date: Sep 18, 2007
Citations: 40
License type: CC BY 2.0
