BFL: a node and edge betweenness based fast layout algorithm for large scale networks

Tatsunori B Hashimoto, Satoru Miyano, Masao Nagasaki, Kaname Kojima

doi:10.1186/1471-2105-10-19

Open Access

https://doi.org/10.1186/1471-2105-10-19

Journal: BMC Bioinformatics
Publication Date: Jan 15, 2009
Citations: 46
License type: cc-by

Affiliation: Harvard College Observatory

Abstract

Background: Network visualization can serve as a useful first step for analysis. However, current graph layout algorithms for biological pathways are insensitive to biologically important information (e.g. subcellular localization, biological node and graph attributes) and/or are not usable for large-scale networks (e.g. more than 10000 elements).

Results: To overcome these problems, we propose the use of a biologically important graph metric, betweenness, a measure of network flow. This metric is highly correlated with many biological phenomena such as lethality and clusters. We devise a new fast parallel algorithm for calculating betweenness to minimize the preprocessing cost. Using this metric, we also devise a node and edge betweenness based fast layout algorithm (BFL). BFL places the high-betweenness nodes at optimal positions and allows the low-betweenness nodes to reach suboptimal positions. Furthermore, BFL reduces the runtime by combining a sequential insertion algorithm with betweenness. For a graph with n nodes, this approach reduces the expected runtime of the algorithm to O(n^2) when considering edge crossings, and to O(n log n) when considering only density and edge lengths.

Conclusion: Our BFL algorithm is compared against fast graph layout algorithms and approaches requiring intensive optimizations. For gene networks, we show that our algorithm is faster than all layout algorithms tested while providing readability on par with intensive optimization algorithms. We achieve a 1.4 second runtime for a graph with 4000 nodes and 12000 edges on a standard desktop computer.
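To make the betweenness metric concrete, the following is a minimal sketch of node betweenness on an unweighted, undirected graph. It uses the standard sequential Brandes algorithm, which is an assumption made for illustration; it is not the parallel algorithm the authors devise, and the class name `BetweennessSketch`, the method `nodeBetweenness`, and the adjacency-list input format are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: sequential Brandes betweenness, not the paper's parallel method.
final class BetweennessSketch {

    /**
     * Node betweenness for an unweighted, undirected graph.
     * adj.get(v) lists the neighbours of node v (each edge appears in both lists).
     */
    static double[] nodeBetweenness(List<List<Integer>> adj) {
        int n = adj.size();
        double[] bc = new double[n];

        for (int s = 0; s < n; s++) {
            int[] dist = new int[n];            // BFS distance from s, -1 = unvisited
            Arrays.fill(dist, -1);
            double[] sigma = new double[n];     // number of shortest s-v paths
            double[] delta = new double[n];     // dependency of s on v
            List<List<Integer>> preds = new ArrayList<>();
            for (int i = 0; i < n; i++) preds.add(new ArrayList<>());
            int[] order = new int[n];           // nodes in BFS visiting order
            int visited = 0;

            ArrayDeque<Integer> queue = new ArrayDeque<>();
            dist[s] = 0;
            sigma[s] = 1.0;
            queue.add(s);

            // Forward phase: count shortest paths with a breadth-first search.
            while (!queue.isEmpty()) {
                int v = queue.poll();
                order[visited++] = v;
                for (int w : adj.get(v)) {
                    if (dist[w] < 0) {
                        dist[w] = dist[v] + 1;
                        queue.add(w);
                    }
                    if (dist[w] == dist[v] + 1) {
                        sigma[w] += sigma[v];
                        preds.get(w).add(v);
                    }
                }
            }

            // Backward phase: accumulate dependencies from the farthest nodes inward.
            for (int i = visited - 1; i >= 0; i--) {
                int w = order[i];
                for (int v : preds.get(w)) {
                    delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w]);
                }
                if (w != s) bc[w] += delta[w];
            }
        }

        // In an undirected graph every pair is counted from both endpoints.
        for (int i = 0; i < n; i++) bc[i] /= 2.0;
        return bc;
    }
}
```

On a four-node path 0-1-2-3, the two inner nodes each lie on two shortest paths between other nodes, so they score 2, while the endpoints score 0. A layout that processes nodes in decreasing order of this score would handle the inner nodes first.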

Highlights

Network visualization would serve as a useful first step for analysis
We present a new node and edge betweenness based fast layout algorithm (BFL) and the specific score methods
The algorithm was implemented in Java with files stored in Cell System Markup Language (CSML) format [25]

Summary

Introduction

Current graph layout algorithms for biological pathways are insensitive to biologically important information (e.g. subcellular localization, biological node and graph attributes) and/or are not usable for large-scale networks (e.g. more than 10000 elements). While extensive research has been done on numerical and statistical methods to infer the relationships among genes, which we call gene networks, methods for analyzing and visualizing large gene networks have received less attention. Metabolic networks with relatively small numbers of nodes (

Methods

Results

Conclusion