Abstract

Networks are useful tools to represent and analyze interactions on a large or genome-wide scale and have therefore been widely used in biology. Many biological networks (such as those that represent regulatory interactions, drug-gene associations, or gene-disease associations) are bipartite, meaning they consist of two different types of nodes, with connections forming only between the two node sets. Analysis of such networks requires methodologies that are specifically designed to handle their bipartite nature. Community structure detection is a method used to identify clusters of nodes in a network. This approach is especially helpful in large-scale biological network analysis, as it can find structure in networks that often resemble a “hairball” of interactions in visualizations. Often, the communities identified in biological networks are enriched for specific biological processes and thus allow one to assign drugs, regulatory molecules, or diseases to such processes. In addition, comparing community structures between different biological conditions can help to identify how network rewiring may lead to, for example, tissue development or disease. In this mini review, we give a theoretical basis for different methods that can be applied to detect communities in bipartite biological networks. We introduce and discuss different scores that can be used to assess the quality of these community structures. We then apply a wide range of methods to a drug-gene interaction network to highlight the strengths and weaknesses of these methods in their application to large-scale, bipartite biological networks.

Highlights

  • Many processes in biology are linked through complex patterns of physical and functional interactions, which can be represented in large-scale, genome-wide biological networks

  • Information comparison: because we lack a ground truth for this network, we cannot assess the quality of results in terms of discovering a previously known community structure

  • We found that the scores were similar, and contained within the [0.6077, 0.7746] range, indicating that the community assignments share a high amount of information
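
To illustrate what it means for two community assignments to "share a high amount of information", the sketch below computes a normalized mutual information (NMI) score between two labelings of the same nodes. This is an assumption made for illustration: the text above does not say which information score was used, and the function names, the arithmetic-mean normalization, and the toy labels are hypothetical choices, not the authors' method.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a community assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two assignments of the same nodes."""
    n = len(a)
    joint = Counter(zip(a, b))          # co-occurrence counts of label pairs
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI normalized by the arithmetic mean of the two entropies.

    Returns a value in [0, 1]; 1 means the two assignments are identical
    up to a relabeling of the communities.
    """
    if len(a) != len(b):
        raise ValueError("both assignments must cover the same nodes")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        return 1.0  # both assignments place every node in one community
    return 2 * mutual_information(a, b) / (h_a + h_b)


# Toy assignments for four nodes (hypothetical data).
method_1 = [0, 0, 1, 1]
method_2 = ["x", "x", "y", "y"]   # same grouping, different label names
method_3 = [0, 1, 0, 1]           # grouping unrelated to method_1

same_grouping = nmi(method_1, method_2)
unrelated_grouping = nmi(method_1, method_3)
```

Under this normalization, a score near 1 means two methods grouped the nodes almost identically, and a score near 0 means that knowing one assignment says almost nothing about the other. Other normalizations (for example by the maximum or the geometric mean of the entropies) are also in use and can give different values for the same pair of assignments.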


Summary

INTRODUCTION

Many processes in biology are linked through complex patterns of physical and functional interactions, which can be represented in large-scale, genome-wide biological networks. Analysis of these networks can help our understanding of biology and medicine (Barabási et al., 2011). We assess the performance of these methods on a large-scale, near genome-wide, gene-drug interaction network and discuss the feasibility of applying these methods to genome-wide networks. We hope this overview will help shed light on the challenges of community detection in genome-wide networks in general, as well as on the advantages and disadvantages of applying some of the most widely used community detection methods to large-scale bipartite genomic networks.

PROBLEM DEFINITION
Unipartite Modularity
Bipartite Modularity Scores
Resolution
COMMUNITY DETECTION STRATEGIES
Projections and Adapted Unipartite Methods
Overlapping Community Detection
APPLICATION TO A GENE-DRUG INTERACTION NETWORK
Preparation of the Network
Application of the Methods
Objective function
Results
DISCUSSION