Abstract

We introduce NetworKit, an open-source software package for analyzing the structure of large complex networks. Appropriate algorithmic solutions are required to handle increasingly common large graph data sets containing up to billions of connections. We describe the methodology applied to develop scalable solutions to network analysis problems, including techniques like parallelization, heuristics for computationally expensive problems, efficient data structures, and modular software architecture. Our goal for the software is to package results of our algorithm engineering efforts and put them into the hands of domain experts. NetworKit is implemented as a hybrid combining the kernels written in C++ with a Python frontend, enabling integration into the Python ecosystem of tested tools for data analysis and scientific computing. The package provides a wide range of functionality (including common and novel analytics algorithms and graph generators) and does so via a convenient interface. In an experimental comparison with related software, NetworKit shows the best performance on a range of typical analysis tasks.

Highlights

  • Our experiments show that NetworKit is capable of quickly processing large-scale networks for a variety of analytics kernels, and does so faster and with a lower memory footprint than closely related software

  • Networks are as diverse as the series of questions we might ask of them, e.g., what is the largest connected component, what are the most central nodes in it, and how do they connect to each other? A practical tool for network analysis should provide modular functions which do not restrict the user to predefined workflows

  • The NetworKit project exists at the intersection of graph algorithm research and network science


Summary

Design goals

There is a variety of software packages which provide graph algorithms in general and network analysis capabilities in particular (see Section 7 for a comparison to related packages). In NetworKit, algorithms and data structures are selected and implemented with high performance and parallelism in mind. A practical tool for network analysis should provide modular functions which do not restrict the user to predefined workflows. While NetworKit works with the standard Python 3 interpreter, calling the module from the IPython shell and the Jupyter Notebook HTML interface (Perez et al., 2013) allows us to integrate it into a fully fledged computing environment for scientific workflows, from data preparation to creating figures. As a Python module, NetworKit can be seamlessly integrated with Python libraries for scientific computing and data analysis, and used alongside tools such as Gephi (Bastian et al., 2009) for graph visualization.
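To illustrate what modular functions without a predefined workflow can look like, here is a minimal sketch in plain Python using only the standard library. It is not NetworKit's API and does not reflect its C++ kernels or parallelism; the function names (`build_adjacency`, `connected_components`, `induced_subgraph`, `degree_ranking`), the toy edge list, and the choice of degree as the centrality measure are all assumptions made for illustration. Each step is an independent function, so the user decides how to chain them.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency list (dict of lists) from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def connected_components(adj):
    """Return all connected components as sets of nodes, found by BFS."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    queue.append(v)
        components.append(component)
    return components


def induced_subgraph(adj, nodes):
    """Restrict the adjacency list to the given node set."""
    return {u: [v for v in adj[u] if v in nodes] for u in nodes}


def degree_ranking(adj):
    """Rank nodes by degree (highest first), breaking ties by node id."""
    pairs = ((u, len(neighbors)) for u, neighbors in adj.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical toy network: one component of four nodes, one of two.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (4, 5)]

adj = build_adjacency(EDGES)
# Question 1: what is the largest connected component?
largest = max(connected_components(adj), key=len)
# Question 2: which nodes are most central within it (here: by degree)?
core = induced_subgraph(adj, largest)
top = degree_ranking(core)[:2]
print(sorted(largest), top)
```

Because each function takes and returns ordinary data structures, a different question (for example, ranking nodes in the whole graph rather than in the largest component) only requires composing the same pieces in a different order.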

Architecture
Framework foundations
Algorithm and implementation patterns
Parallelism
Heuristics and approximation algorithms
Efficient data structures
Modular design
Analytics
Global network properties
Node centrality
Partitioning the network
Network generators
Exploratory network analysis with network profiles
Comparison to related software
Performance evaluation
Benchmark
Comparative benchmark
Conclusion