MAID : An effect size based model for microarray data integration across laboratories and platforms

Ivan Borozan, Limin Chen, Bryan Paeper, Aled M Edwards, Ian D Mcgilvray, Zhaolei Zhang, Michael Katze, Jenny E Heathcote

doi:10.1186/1471-2105-9-305

Abstract

Background: Gene expression profiling has the potential to unravel molecular mechanisms behind gene regulation and identify gene targets for therapeutic interventions. As microarray technology matures, the number of microarray studies has increased, resulting in many different datasets available for any given disease. The sensitivity and reliability of measurements of gene expression changes can be improved through a systematic integration of different microarray datasets that address the same or similar biological questions.

Results: Traditional effect size models cannot be used to integrate array data that directly compare treatment to control samples expressed as log ratios of gene expression. Here we extend the traditional effect size model to integrate as many array datasets as possible. The extended effect size model (MAID) can integrate any array data type generated with either single- or two-channel arrays, using either direct or indirect designs, across different laboratories and platforms. The model uses two standardized indices: the standard effect size score for experiments with two groups of data, and a new standardized index that measures the difference in gene expression between treatment and control groups for one-sample data with replicate arrays. The statistical significance of the treatment effect across studies for each gene is determined by appropriate permutation methods, depending on the type of data integrated. We apply our method to three different expression datasets from two different laboratories, generated using three different array platforms and two different experimental designs. Our results indicate that the proposed integration model produces an increase in statistical power for identifying differentially expressed genes when integrating data across experiments and when compared to other integration models. We also show that genes found to be significant using our data integration method are of direct biological relevance to the three experiments integrated.

Conclusion: High-throughput genomics data provide a rich and complex source of information that could play a key role in deciphering intricate molecular networks behind disease. Here we propose an extension of the traditional effect size model to allow the integration of as many array experiments as possible, with the aim of increasing the statistical power for identifying differentially expressed genes.
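The two kinds of standardized index described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the formula for the new one-sample index, so `one_sample_index` below assumes a plain standardized mean of the log ratios with an approximate variance, the fixed-effect (inverse-variance) combination is likewise an assumption, and sign flipping is only one plausible permutation scheme for log-ratio data. All function names and input values are hypothetical.

```python
import math
import random
import statistics


def hedges_g(treatment, control):
    """Standard two-group effect size (Hedges' g) and its approximate variance."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = ((n_t - 1) * statistics.variance(treatment)
                  + (n_c - 1) * statistics.variance(control)) / (n_t + n_c - 2)
    d = (statistics.mean(treatment) - statistics.mean(control)) / math.sqrt(pooled_var)
    g = (1 - 3 / (4 * (n_t + n_c) - 9)) * d  # small-sample bias correction
    var_g = (n_t + n_c) / (n_t * n_c) + g ** 2 / (2 * (n_t + n_c))
    return g, var_g


def one_sample_index(log_ratios):
    """ASSUMED one-sample index: mean log ratio divided by its standard deviation."""
    n = len(log_ratios)
    d = statistics.mean(log_ratios) / statistics.stdev(log_ratios)
    var_d = 1 / n + d ** 2 / (2 * n)  # large-sample approximation (assumption)
    return d, var_d


def combine_fixed_effect(effects):
    """Inverse-variance weighted mean of (effect, variance) pairs and its z-score."""
    weights = [1 / var for _, var in effects]
    mu = sum(w * eff for w, (eff, _) in zip(weights, effects)) / sum(weights)
    return mu, mu * math.sqrt(sum(weights))


def sign_flip_p_value(log_ratios, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a one-sample mean via random sign flips."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(log_ratios))
    hits = 0
    for _ in range(n_perm):
        flipped = [x * rng.choice((-1, 1)) for x in log_ratios]
        if abs(statistics.mean(flipped)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Illustrative, made-up values for a single gene (not data from the paper).
two_group = hedges_g([8.1, 8.4, 8.9, 8.6], [7.2, 7.5, 7.1, 7.4])
one_sample = one_sample_index([0.9, 1.2, 0.7, 1.1])
combined_effect, z_score = combine_fixed_effect([two_group, one_sample])
p_value = sign_flip_p_value([0.9, 1.2, 0.7, 1.1])
```

Putting both indices on a common standardized scale is what lets a two-group study and a log-ratio study contribute to the same per-gene summary. In the sketch, a study with a smaller variance simply receives a larger weight.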

Highlights

Gene expression profiling has the potential to unravel molecular mechanisms behind gene regulation and identify gene targets for therapeutic interventions
All three datasets compared normal liver tissue to liver tissue chronically infected with hepatitis C virus (HCV)
The second row designates the number of genes from the first row that are either up regulated in HCV samples compared to Normal with |fold| ≥ 1.5 [...] for both of these categories: "immune response", p-value = 4.96e-6 (MAID) vs p-value = 1.84e-5, and "defense response", p-value = 1.81e-5 (MAID) vs p-value = 3.54e-5. These results indicate that when the gene set selected by the model of Rhodes et al [13] is compared with the one selected by MAID, the larger MAID gene set improves the enrichment significance of the two most significant and HCV-relevant GO categories, which points to an increase in statistical power over the model proposed by Rhodes et al [13]

Summary

Introduction

Gene expression profiling has the potential to unravel molecular mechanisms behind gene regulation and identify gene targets for therapeutic interventions. The sensitivity and reliability of measurements of gene expression changes can be improved through a systematic integration of different microarray datasets that address the same or similar biological questions. Different statistical and computational tools are used in the analysis of microarray results. Because of these differences, it is challenging to extract reproducible, biologically meaningful information from different DNA microarray experiments that address the same, or very similar, biological questions. The first integration approach consists of combining summary measures of gene expression across studies. The advantage of this type of approach is that it avoids the need to estimate the inter-study variability and sidesteps the issue of cross-platform normalization. Parmigiani et al [14] proposed a different model that uses a correlation-based method to search for consistent gene expression patterns across multiple studies.

Journal: BMC Bioinformatics	Publication Date: Jul 10, 2008
Citations: 34	License type: CC BY 2.0
