Abstract
Module network inference is an established statistical method to reconstruct co-expression modules and their upstream regulatory programs from integrated multi-omics datasets measuring the activity levels of various cellular components across different individuals, experimental conditions or time points of a dynamic process. We have developed Lemon-Tree, an open-source, platform-independent, modular, extensible software package implementing state-of-the-art ensemble methods for module network inference. We benchmarked Lemon-Tree using large-scale tumor datasets and showed that Lemon-Tree algorithms compare favorably with state-of-the-art module network inference software. We also analyzed a large dataset of somatic copy-number alterations and gene expression levels measured in glioblastoma samples from The Cancer Genome Atlas and found that Lemon-Tree correctly identifies known glioblastoma oncogenes and tumor suppressors as master regulators in the inferred module network. Novel candidate driver genes predicted by Lemon-Tree were validated using tumor pathway and survival analyses. Lemon-Tree is available from http://lemon-tree.googlecode.com under the GNU General Public License version 2.0.
Highlights
Recent years have witnessed a dramatic increase in new technologies for interrogating the activity levels of various cellular components on a genome-wide scale, including genomic, epigenomic, transcriptomic, and proteomic information [1]
We have previously shown that running a two-way clustering algorithm until convergence, and thereafter identifying the regulatory programs that give rise to the inferred condition clusterings for each gene module results in higher module network model likelihoods and reduced computational cost compared to the traditional approach of iteratively updating gene modules and regulator assignments [14, 16]
We compared the performance of Lemon-Tree with CONEXIC (COpy Number and Expression In Cancer), a state-of-the-art module network algorithm designed to integrate matched copy number and gene expression data from tumor samples [12]
Summary
Recent years have witnessed a dramatic increase in new technologies for interrogating the activity levels of various cellular components on a genome-wide scale, including genomic, epigenomic, transcriptomic, and proteomic information [1]. We benchmarked Lemon-Tree using large-scale datasets of somatic copy-number alterations and gene expression levels measured in glioblastoma samples from The Cancer Genome Atlas and found that Lemon-Tree compares favorably with existing module network softwares and correctly identifies known glioblastoma oncogenes and tumor suppressors as master regulators in the inferred module network.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.