Abstract

The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built from microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While co-expression networks have been shown to be biologically relevant, it has not been determined to what extent any given network is functionally robust to perturbations in the input sample set. Such a test requires hundreds of networks, and hence a tool that can construct them rapidly. To examine the functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested the functional robustness of networks for human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate a dramatic decrease in network construction time and computational requirements and show that, despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust.

Highlights

  • Analyzing gene expression across one or more biological systems is a complex challenge for experimental design, computational resource requirements, and biological interpretation

  • Random Matrix Theory (RMT), as used by RMTGeneNet, examines changes in the nearest neighbor spacing distribution (NNSD) of the eigenvalues of the similarity matrix

  • It has been shown that the NNSD of the eigenvalues of any random matrix follows a Gaussian orthogonal ensemble (GOE) distribution, whereas that of a non-random matrix follows a Poisson distribution [27]
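The GOE versus Poisson contrast in the highlights can be illustrated with a small numerical sketch. This is not the RMTGeneNet implementation: the function names, matrix sizes, and the 0.1 cut-off are all hypothetical choices for illustration. The spacings are rescaled by the mean spacing of the central half of the spectrum, which is only a crude substitute for proper spectral unfolding, and a set of independent uniform levels is assumed as a stand-in for the Poisson case.

```python
import numpy as np


def normalized_spacings(eigenvalues):
    """Nearest-neighbor spacings of the central half of the spectrum, scaled to mean 1.

    Dividing by the mean spacing of the central half is a crude stand-in
    for spectral unfolding (an assumption of this sketch).
    """
    ev = np.sort(np.asarray(eigenvalues, dtype=float))
    n = ev.size
    central = ev[n // 4 : n - n // 4]  # drop the outer quarters of the spectrum
    spacings = np.diff(central)
    return spacings / spacings.mean()


def goe_density(s):
    """Wigner surmise, the standard approximation to the GOE spacing density."""
    return (np.pi / 2.0) * s * np.exp(-np.pi * s**2 / 4.0)


def poisson_density(s):
    """Spacing density for uncorrelated levels."""
    return np.exp(-s)


rng = np.random.default_rng(0)

# Random symmetric matrix: its eigenvalues repel each other (GOE-like spacings).
a = rng.normal(size=(400, 400))
random_symmetric = (a + a.T) / 2.0
goe_spacings = normalized_spacings(np.linalg.eigvalsh(random_symmetric))

# Independent levels: no repulsion, so small spacings are common (Poisson-like).
independent_levels = rng.uniform(size=2000)
poisson_spacings = normalized_spacings(independent_levels)

# Fraction of very small spacings; 0.1 is an arbitrary illustrative cut-off.
small_goe = np.mean(goe_spacings < 0.1)
small_poisson = np.mean(poisson_spacings < 0.1)
```

The fraction of very small spacings is a simple signature of the difference: level repulsion makes near-degenerate eigenvalues rare in the random-matrix case, while independent levels cluster freely. A histogram of either spacing array can be compared against `goe_density` and `poisson_density` for a fuller picture.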


Introduction

Analyzing gene expression across one or more biological systems is a complex challenge for experimental design, computational resource requirements, and biological interpretation. In co-expression networks, nodes represent gene products (e.g., mRNA transcripts) and edges indicate a significant correlation of expression between a gene pair (co-expression). Groups of nodes that are highly connected (and correlated) indicate a biological relationship and can be separated into cofunctional gene interaction modules. Many methods for constructing co-expression networks compare gene expression measurements from samples across multiple experimental conditions using a correlation statistic. When the behavior of the input data does not suit these correlation methods, mutual information (MI) functions can be calculated instead to determine the relationships among genes. Once a statistical method has been chosen, an n-transcript by m-sample expression matrix is used as input for pair-wise correlation analysis, resulting in an n × n matrix of correlation values, known as a similarity matrix.
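The construction described above can be sketched in a few lines. This is a minimal illustration rather than the method of any particular tool: it assumes Pearson correlation as the statistic, uses synthetic data, and the function names and the 0.9 cut-off are hypothetical placeholders. In the approach discussed here, the threshold would be chosen by RMT rather than fixed by hand.

```python
import numpy as np


def similarity_matrix(expr):
    """Pairwise Pearson correlation between the rows (transcripts) of expr.

    expr: n-transcript by m-sample array. Returns an n x n symmetric matrix.
    """
    expr = np.asarray(expr, dtype=float)
    return np.corrcoef(expr)  # np.corrcoef treats each row as one variable


def adjacency(sim, threshold):
    """Keep an edge wherever |correlation| >= threshold, excluding self-loops.

    Thresholding on the absolute value is an assumption of this sketch.
    """
    adj = np.abs(sim) >= threshold
    np.fill_diagonal(adj, False)
    return adj


# Toy data (synthetic, for illustration only): 5 transcripts, 20 samples.
rng = np.random.default_rng(0)
expr = rng.normal(size=(5, 20))
# Make transcripts 0 and 1 strongly co-expressed.
expr[1] = 2.0 * expr[0] + 0.01 * rng.normal(size=20)

sim = similarity_matrix(expr)        # n x n similarity matrix
adj = adjacency(sim, threshold=0.9)  # 0.9 is an arbitrary placeholder cut-off
```

Here `sim` plays the role of the similarity matrix, and `adj` is the thresholded network in which the two co-varying transcripts are joined by an edge.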

