Abstract

Graphs are the cornerstone of many algorithms for network analysis. When the problem is relatively small, as measured by the number of vertices and edges in the graph, most methods perform adequately. As the problem size grows, more compute power is required. Distributed computing is one viable option for addressing this issue, but it cannot scale indefinitely. At some point, it becomes necessary to turn to heuristic approaches. Spectral graph theory is an example of such an approximate scheme. In this paper, we combine spectral analysis with distributed computing using Apache Spark. The paper is accompanied by a publicly available proof-of-concept implementation. The system was extensively performance-tested, and the results show that Apache Spark is an excellent fit for spectral graph analysis. Furthermore, the resulting code is straightforward thanks to Spark's intuitive distributed programming model and well-designed APIs.

