Abstract

The vertex-centric programming model is now widely used for processing large graphs. User-defined vertex programs are executed in parallel over every vertex of a graph, but the imperative, explicit message-passing style of existing systems makes defining a vertex program unintuitive and difficult. This article presents Fregel, a purely functional domain-specific language for processing large graphs, and describes its model, design, and implementation. Fregel is a subset of Haskell, so Haskell tools can be used to test and debug Fregel programs. The vertex-centric computation is abstracted through compositional programming with second-order functions on graphs provided by Fregel. A Fregel program can be compiled into imperative programs for the Giraph and Pregel+ vertex-centric frameworks. Because Fregel is functional and free of side effects, various transformations and optimizations can be applied during compilation. The programmer is thus freed from the burden of program optimization, which must be done manually in existing imperative systems. Experimental results for typical examples demonstrated that the compiled code can be executed with reasonable and promising performance.

Highlights

  • The rapid growth of large-scale data is driving demand for efficient processing of the data to obtain valuable knowledge

  • We present Fregel, a functional domain-specific language (DSL) for declarative-style programming on large graphs that is based on the pulling-style vertex-centric model

  • We show that a Fregel program can be compiled into a program for two vertex-centric frameworks through an intermediate representation (IR) that is independent of the target framework

Introduction

The rapid growth of large-scale data is driving demand for efficient processing of the data to obtain valuable knowledge. Typical instances of large-scale data are large graphs such as social networks, road networks, and consumer purchase histories. As graphs become more and more prevalent, highly efficient large-graph processing is becoming more and more important. A natural solution for dealing with large graphs is to use parallel processing. Developing efficient parallel programs is not an easy task, however, because subtle programming mistakes lead to fatal errors such as deadlock and to nondeterministic results.
