Abstract

BackgroundExtensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type.ResultsSince entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2.ConclusionsIn order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem.

Highlights

  • Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks

  • These either assign each data point to several hard clusters [10] or determine a degree of membership to each cluster [11,12]. Such methods are known as fuzzy clustering, but have not been applied to the common biological case of k-partite graphs. To overcome these difficulties we developed a novel fuzzy clustering algorithm based on a non-negative matrix factorization (NMF) model [13]

  • It is similar in structure to multiplicative algorithms for NMF, with the difference that we address a three-matrix factorization problem, and have to deal with a multi-summand cost function

Read more

Summary

Introduction

Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. The human disease network presented in [3] is an example for a bipartite graph having two disjoint sets of vertices. Structural questions need to be addressed outside of the unipartite graph setting. Other examples of bipartite networks are protein complex or gene-localization, gene-function or microRNA-target networks. The integration of such network data leads to complex k-partite graphs

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.