Abstract

Constructing genome-scale weighted gene association networks (WGANs) from multiple data sources is an active research topic in systems biology. In this paper, we employ information entropy to describe the degree of uncertainty of gene-gene links and propose a strategy for integrating weighted networks. We use this method to integrate four existing human weighted gene association networks and construct a much larger WGAN, which includes richer biological information while still maintaining high functional relevance between linked gene pairs. The new WGAN performs satisfactorily in disease gene prediction, which suggests that our integration strategy is reliable. Compared with existing integration methods, our method takes advantage of the inherent characteristics of the component networks and depends less on the biological background of the data. It can make full use of existing biological networks with low computational effort.
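
To make the notion of link uncertainty concrete, the following is a minimal sketch of the standard binary Shannon entropy applied to a single gene-gene link. It assumes that an edge weight can be read as a confidence p in [0, 1] that the link is real; this interpretation and the function name `link_entropy` are illustrative assumptions, not the exact formulation used in the paper.

```python
import math


def link_entropy(p: float) -> float:
    """Binary Shannon entropy (in bits) of a link with confidence p.

    Assumption: p is an edge weight already scaled to [0, 1] and read as
    the probability that the gene-gene link is real. This is a generic
    illustration, not the paper's integration formula.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        # A certain link (present or absent) carries no uncertainty.
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


if __name__ == "__main__":
    # Hypothetical confidence values, chosen only for illustration.
    for weight in (0.1, 0.5, 0.9):
        print(weight, round(link_entropy(weight), 4))
```

Under this reading, a weight near 0.5 is the most uncertain (1 bit), while weights near 0 or 1 are the least uncertain, so entropy could serve as one possible measure of how much a component network "knows" about a link.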

Highlights

  • In recent years, high-throughput biological experimental techniques[1, 2] have generated massive omic data sources at the molecular level, such as protein-protein interaction data[3], gene co-expression data[4], and transcriptional regulation data[5]

  • We propose an algorithm based on information entropy [32,33,34,35,36] to integrate multiple weighted gene association networks (WGANs)

  • The networks HumanNet, FunCoup and STRING are derived mainly from a variety of databases and are constructed by fusing physical interaction data and functional association data using log-likelihood scoring methods or a naive Bayesian framework


Summary

Introduction

High-throughput biological experimental techniques[1, 2] have generated massive omic data sources at the molecular level, such as protein-protein interaction data[3], gene co-expression data[4], and transcriptional regulation data[5]. Considerable effort has been devoted to unraveling the interplay among all genes in an organism by integrating these data into interaction networks [6,7,8,9,10]. In these networks, nodes represent genes, edges represent interactions between genes, and edge weights are evidence scores for the interactions, fused from various biological data sources[11, 12]. There are two main methods for integrating various biological functional data into a comprehensive network. One is the subjective scoring integration method, and the other is the statistical inference scoring algorithm.

