Abstract

We present NetCore, a novel network propagation approach based on node coreness, for phenotype–genotype associations and module identification. NetCore addresses the node degree bias in PPI networks by using node coreness in the random walk with restart procedure, and achieves improved re-ranking of genes after propagation. Furthermore, NetCore implements a semi-supervised approach to identify phenotype-associated network modules, which anchors the identification of novel candidate genes at known genes associated with the phenotype. We evaluated NetCore on gene sets from 11 different GWAS traits and showed, using cross-validation, improved performance compared to standard degree-based network propagation. We also applied NetCore to identify disease genes and modules for schizophrenia GWAS data and pan-cancer mutation data, compared it to existing network propagation approaches, and showed the benefits of NetCore over them. We provide an easy-to-use implementation, together with a high-confidence PPI network extracted from ConsensusPathDB, which can be applied to various types of genomics data in order to obtain a re-ranking of genes and functionally relevant network modules.

Highlights

  • The analysis of genome-wide molecular data is a complex task, and protein–protein interaction (PPI) networks, i.e. the graphical representation of the physical contacts between proteins in a cell, have emerged as a powerful scaffold for integrating different data types and boosting the signal-to-noise ratio of such experiments [1].

  • (1) The first step is the extraction of a protein–protein interaction (PPI) network, which we obtained from the ConsensusPathDB database.

  • (2) In the second step, network propagation based on a random walk with restart is applied with a normalization based on node coreness; this yields a final re-ranking of the nodes, along with a significance assignment based on degree-preserving randomized networks.
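
The two steps above can be sketched in Python with NetworkX and NumPy. This is a minimal illustration under stated assumptions, not the NetCore implementation: the summary does not spell out the exact coreness normalization, so the transition weights used here (a walker at node j moves to neighbor i with probability proportional to the core number of i) are one plausible reading. The toy graph, the seed nodes, the restart value of 0.8, the number of randomized networks and all function names are hypothetical choices made for the example.

```python
import networkx as nx
import numpy as np


def coreness_transition_matrix(graph, nodes):
    """Column-stochastic matrix: a walker at node j moves to neighbor i with
    probability core(i) / (sum of core numbers over the neighbors of j).
    ASSUMPTION: this normalization is illustrative, not taken from the paper."""
    core = nx.core_number(graph)  # requires a graph without self-loops
    index = {node: k for k, node in enumerate(nodes)}
    W = np.zeros((len(nodes), len(nodes)))
    for j in nodes:
        neighbors = list(graph.neighbors(j))
        total = sum(core[i] for i in neighbors)
        for i in neighbors:
            W[index[i], index[j]] = core[i] / total
    return W


def random_walk_with_restart(W, p0, restart=0.8, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the update is tiny."""
    p0 = p0 / p0.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def empirical_p_values(graph, nodes, p0, observed, n_random=20):
    """Compare observed scores with scores on degree-preserving randomized graphs."""
    exceed = np.zeros(len(nodes))
    n_edges = graph.number_of_edges()
    for k in range(n_random):
        shuffled = graph.copy()
        # Double edge swaps rewire edges while keeping every node degree fixed.
        nx.double_edge_swap(shuffled, nswap=n_edges, max_tries=100 * n_edges, seed=k)
        W_random = coreness_transition_matrix(shuffled, nodes)
        exceed += random_walk_with_restart(W_random, p0) >= observed
    return (exceed + 1) / (n_random + 1)


# Toy stand-in for a PPI network; a real analysis would load the PPI edge list.
graph = nx.karate_club_graph()
nodes = list(graph.nodes())

# Hypothetical seed nodes carrying the initial experimental weights.
p0 = np.zeros(len(nodes))
p0[[0, 33]] = 1.0

W = coreness_transition_matrix(graph, nodes)
scores = random_walk_with_restart(W, p0)
p_values = empirical_p_values(graph, nodes, p0, scores)

# Re-rank nodes by propagated weight, highest first.
ranking = [nodes[k] for k in np.argsort(-scores)]
```

With a restart value this high, most of the weight stays on the seed nodes, so they lead the ranking; lower restart values spread more weight into the network neighborhood. In practice far more than 20 randomized networks would be used for the significance estimate.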


Summary

Introduction

The analysis of genome-wide molecular data is a complex task, and protein–protein interaction (PPI) networks, i.e. the graphical representation of the physical contacts between proteins in a cell, have emerged as a powerful scaffold for integrating different data types and boosting the signal-to-noise ratio of such experiments [1]. Network propagation combines experimental data with molecular interaction information: the topology of the network is used to propagate the data effects throughout the network, thereby amplifying the experimental signal and supporting its functional interpretation. This approach covers a wide range of data domains and has been applied, for example, to associate genetic variants with cancer (sub-)phenotypes [2] and to derive patient-specific networks from phosphoproteome analysis [3]. The propagation is usually executed on a PPI network and yields new weights for the genes (see Supplementary Methods). These weights can be used to re-rank the genes in order to identify novel disease genes, or as input for a further module identification step that identifies sub-networks associated with the phenotype under study. Several network propagation approaches have already been used, for example, for the identification of novel disease genes [4,5,6], the discovery of disease-associated network modules [2,7,8] and the prediction of drug targets [9].
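
For orientation, random walk with restart, the propagation scheme named in the abstract, is commonly written as the iteration below, together with its steady state. The notation is an assumption introduced here and is not taken from the paper: W is a column-normalized version of the network adjacency matrix, p^(0) holds the initial gene weights scaled to sum to one, and α (with 0 < α ≤ 1) is the restart probability.

```latex
p^{(t+1)} = (1-\alpha)\, W\, p^{(t)} + \alpha\, p^{(0)},
\qquad
p^{(\infty)} = \alpha \left( I - (1-\alpha)\, W \right)^{-1} p^{(0)}
```

The entries of the steady-state vector are the propagated weights used for re-ranking; the choice of normalization for W is where a degree-based and a coreness-based scheme would differ.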
