Abstract

BackgroundWe develop a new concept that reflects how genes are connected based on microarray data using the coefficient of determination (the squared Pearson correlation coefficient). Our gene rank combines a priori knowledge about gene connectivity, say, from the Gene Ontology (GO) database, and the microarray expression data at hand, called the microarray enriched gene rank, or simply gene rank (GR). GR, similarly to Google PageRank, is defined in a recursive fashion and is computed as the left maximum eigenvector of a stochastic matrix derived from microarray expression data. An efficient algorithm is devised that allows computation of GR for 50 thousand genes with 500 samples within minutes on a personal computer using the public domain statistical package R.ResultsComputation of GR is illustrated with several microarray data sets. In particular, we apply GR (1) to answer whether bad genes are more connected than good genes in relation with cancer patient survival, (2) to associate gene connectivity with cluster/subtypes in ovarian cancer tumors, and to determine whether gene connectivity changes (3) from organ to organ within the same organism and (4) between organisms.ConclusionsWe have shown by examples that findings based on GR confirm biological expectations. GR may be used for hypothesis generation on gene pathways. It may be used for a homogeneous sample or for comparison of gene connectivity among cases and controls, or in longitudinal setting.

Highlights

  • The key element of Google’s success is the PageRank that determines the order of webpages displayed for a keyword search

  • The goal of this work is to equip the researcher with a new concept, microarray enriched gene rank (GR), that combines a priori knowledge about gene connectivity with researcher-derived microarray data, and can be computed on his/her own personal computer with as many as 50,000 genes and 500 samples within few minutes

  • The goal of this section is to understand how this phenotype-based ranking is related to our microarray enriched gene rank reflecting the gene connectivity

Read more

Summary

Introduction

The key element of Google’s success is the PageRank that determines the order of webpages displayed for a keyword search. Two versions of GeneRank are suggested: (1) using GO annotation or (2) Pearson correlation coefficient, r. In both cases, the entries of the connectivity matrix are binary: 0 genes are not connected and 1 if genes are connected. In the correlation version (2), the genes defined connected if r > 0.5. We develop a new concept that reflects how genes are connected based on microarray data using the coefficient of determination (the squared Pearson correlation coefficient). GR, to Google PageRank, is defined in a recursive fashion and is computed as the left maximum eigenvector of a stochastic matrix derived from microarray expression data. An efficient algorithm is devised that allows computation of GR for 50 thousand genes with 500 samples within minutes on a personal computer using the public domain statistical package R

Objectives
Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.