Abstract

Sub-networks can expose complex patterns in an entire bio-molecular network by extracting interactions that depend on temporal or condition-specific contexts. When genes interact with each other during cellular processes, they may form differential co-expression patterns with other genes across different cell states. Identifying condition-specific sub-networks is therefore important for investigating how a living cell adapts to environmental changes. In this work, we propose the weighted MAXimum clique (WMAXC) method to identify a condition-specific sub-network. WMAXC first defines scoring functions that jointly measure condition-specific changes in both individual genes and gene-gene co-expressions. It then employs a relaxed formulation of the general maximum clique problem, relating the maximum scored clique of a weighted graph to the optimization of a quadratic objective function under sparsity constraints. We combine a continuous genetic algorithm with a projection procedure to obtain a single optimal sub-network that maximizes the objective function (the scoring function) over the standard simplex (the sparsity constraints). We applied WMAXC to simulated data and to real data sets of ovarian and prostate cancer. Compared with previous methods, WMAXC selected a large fraction of cancer-related genes, which were enriched in cancer-related pathways. The results demonstrated that our method efficiently captured a subset of genes relevant under the investigated condition.
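
As a rough illustration of the optimization described in the abstract, the sketch below maximizes a quadratic score x^T W x over the standard simplex and reads the selected sub-network off the support of the solution. It is not the authors' implementation: the abstract describes a continuous genetic algorithm combined with a projection step, whereas this sketch substitutes plain projected gradient ascent for brevity. The score matrix `W`, the function names, the step size, and the support threshold are all assumptions made for the example.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    n = v.size
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0                  # shifted cumulative sums
    k = np.arange(1, n + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def maximize_on_simplex(W, steps=500, lr=0.1):
    """Projected gradient ascent on x^T W x (W symmetric, hypothetical scores).

    Starts from the barycenter of the simplex. The objective is generally
    non-concave, so this only finds a local maximizer.
    """
    n = W.shape[0]
    x = np.full(n, 1.0 / n)
    for _ in range(steps):
        grad = 2.0 * W @ x                    # gradient of x^T W x
        x = project_to_simplex(x + lr * grad)
    return x


def selected_nodes(x, tol=1e-6):
    """Nodes with non-negligible weight form the candidate sub-network."""
    return sorted(int(i) for i in np.nonzero(x > tol)[0])


# Toy graph: a strongly scored triangle {0, 1, 2} and a weak edge 3-4.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2)]:
    W[i, j] = W[j, i] = 1.0
W[3, 4] = W[4, 3] = 0.2

x = maximize_on_simplex(W)
print(selected_nodes(x), float(x @ W @ x))
```

The simplex constraint is what induces sparsity here: weight that moves onto the highly scored triangle must be taken from the other nodes, which are driven to exactly zero by the projection. Because the problem is non-concave, a population-based search such as the genetic algorithm mentioned in the abstract may explore the simplex more thoroughly than a single gradient run.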

Highlights

  • For the analysis of ovarian cancer, we considered only the genes that were included in the protein-protein interaction (PPI) network, which consists of 8,721 genes and 33,771 interactions

  • We evaluated the performance of weighted MAXimum clique (WMAXC) on simulated data by comparing it with COSINE [12], because COSINE had already been compared with several other methods, including jActiveModules [6], an edge-based method [10], and a local method [15]

Introduction

A central problem in network biology is the identification of genes and pathways involved in the same biological processes or physiological conditions. Network structures have often been used to describe these complex bio-molecular pathways and functional modules by representing a whole set of interactions as overlapping sub-networks, each associated with a specific condition [1,2]. Many methods have been developed to construct bio-molecular networks by comparing multiple sets of microarray data under different conditions. Because the expression levels of different genes across a series of biological conditions influence each other, correlations between genes have been widely used to analyze microarray gene expression measurements. Waaijenborg and Zwinderman [3] developed a penalized canonical correlation analysis method that extracts a subset of variables capturing the common features among genes by maximizing a canonical correlation between gene expression measurements. Witten and Tibshirani [4] presented extensions of sparse canonical correlation analysis as a supervised method, which identifies linear combinations of sets of variables that are correlated with each other and associated with an outcome.
