Abstract
Gaussian graphical models (GGMs) are a popular form of network model in which nodes represent features in multivariate normal data and edges reflect conditional dependencies between these features. GGM estimation is an active area of research. Currently available tools for GGM estimation require investigators to make several choices regarding algorithms, scoring criteria, and tuning parameters. An estimated GGM may be highly sensitive to these choices, and the accuracy of each method can vary based on structural characteristics of the network such as topology, degree distribution, and density. Because these characteristics are a priori unknown, it is not straightforward to establish universal guidelines for choosing a GGM estimation method. We address this problem by introducing SpiderLearner, an ensemble method that constructs a consensus network from multiple estimated GGMs. Given a set of candidate methods, SpiderLearner estimates the optimal convex combination of results from each method using a likelihood-based loss function. -fold cross-validation is applied in this process, reducing the risk of overfitting. In simulations, SpiderLearner performs better than or comparably to the best candidate methods according to a variety of metrics, including relative Frobenius norm and out-of-sample likelihood. We apply SpiderLearner to publicly available ovarian cancer gene expression data including 2013 participants from 13 diverse studies, demonstrating our tool's potential to identify biomarkers of complex disease. SpiderLearner is implemented as flexible, extensible, open-source code in the R package ensembleGGM at https://github.com/katehoffshutta/ensembleGGM.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.