A highly cost-effective task scheduling strategy for very large graph computation

Yongli Cheng, Fang Wang, Hong Jiang, Yu Hua, Dan Feng, Yunxiang Wu, Tingwei Zhu, Wenzhong Guo

doi:10.1016/j.future.2018.07.010

Abstract

Existing distributed graph-processing frameworks, e.g., Pregel, GPS and Giraph, handle large-scale graphs in the memory of clusters built from commodity compute nodes for better scalability and performance. Although these frameworks can scale out to thousands of compute nodes as graphs grow, graphs beyond a certain size usually require investments in machines that are either beyond the financial capability of, or unprofitable for, most small and medium-sized organizations, making the deployment of their large-scale graph-computing jobs difficult if not impossible. At the other end of the spectrum of graph-processing framework research, single-node disk-based graph-computing frameworks, such as GraphChi and XStream, handle large-scale graphs on just one commodity computer, which makes efficient use of hardware but at the cost of low performance and limited scalability. Motivated by this dichotomy, in this paper we propose a pipeline-based task scheduling strategy with high cost-effectiveness. We use this scheduling strategy to design and implement a distributed disk-based graph-processing framework, called DD-Graph, that can process very large graphs with trillions of edges on a small cluster while achieving the high performance of existing distributed in-memory graph-processing frameworks. The evaluation of the DD-Graph prototype, driven by very large graph datasets, shows that it saves 73% of the hardware costs of GPS while running 1.34x faster than GPS.
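To make the two headline numbers easier to compare, the sketch below combines them into a single ratio. It assumes, purely for illustration, that cost-effectiveness is measured as performance per unit of hardware cost; the abstract does not define the metric, and the function name `relative_cost_effectiveness` is hypothetical rather than something taken from the paper.

```python
def relative_cost_effectiveness(speedup: float, cost_saving: float) -> float:
    """Performance-per-cost of a system relative to a baseline.

    Assumption (not from the paper): cost-effectiveness is defined as
    performance divided by hardware cost.

    speedup      -- how many times faster than the baseline (e.g. 1.34)
    cost_saving  -- fraction of the baseline hardware cost saved (e.g. 0.73)
    """
    relative_cost = 1.0 - cost_saving  # remaining cost as a fraction of the baseline
    if relative_cost <= 0.0:
        raise ValueError("cost_saving must be less than 1.0")
    return speedup / relative_cost


# Figures reported in the abstract: 73% hardware cost saved, 1.34x faster than GPS.
ratio = relative_cost_effectiveness(speedup=1.34, cost_saving=0.73)
print(f"{ratio:.2f}")  # prints 4.96
```

Under this assumed definition, the reported figures would correspond to roughly five times the performance per unit of hardware cost compared with GPS.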

Journal: Future Generation Computer Systems
Publication Date: Jul 18, 2018
Citations: 8
