Improved algorithms for intermediate dataset storage in a cloud-based dataflow

Jie Cheng,Daming Zhu,Binhai Zhu

doi:10.1016/j.tcs.2016.05.042

Open Access

Journal: Theoretical Computer Science	Publication Date: Jun 6, 2016
Citations: 5	License type: publisher-specific-oa

Affiliation: Shandong University, Montana State University

#Intermediate Storage #Maximum Cost

Abstract

To run a dataflow at as low a cost as possible, one often has to decide which data-sets in a data-set sequence should be stored, with the rest regenerated when needed. The Intermediate Data-set Storage problem arises from this situation. The current best algorithm for this problem takes O(n^4) time. In this paper, we present two improved algorithms for this problem: the first runs in O(n^2) time and the second in O(rn) time, where n is the number of data-sets in the dataflow and r is a number indicating the ratio of the maximum storage cost to the minimum computation cost in the dataflow.
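To make the store-or-regenerate trade-off concrete, the following is a minimal sketch of a textbook dynamic program over a linear data-set sequence. It is not the authors' algorithm, and the cost model is an assumption, since the abstract does not state one: data-set i has a hypothetical computation cost x[i] (to derive it from its predecessor), a storage cost rate y[i], and a usage frequency v[i]. A deleted data-set is assumed to cost v[i] times the total computation cost of regenerating it from the nearest stored predecessor (or from the dataflow input if there is none). The function name min_cost_storage is likewise illustrative.

```python
from typing import List, Tuple


def min_cost_storage(
    x: List[float], y: List[float], v: List[float]
) -> Tuple[float, List[int]]:
    """Return (minimum total cost rate, 0-based indices of stored data-sets).

    Assumed cost model (not taken from the paper):
      x[i]: cost of computing data-set i from data-set i-1
      y[i]: cost rate of keeping data-set i stored
      v[i]: how often data-set i is used
    """
    n = len(x)
    inf = float("inf")
    # Positions 1..n are data-sets; 0 is a virtual source that is always
    # available, n+1 is a virtual sink that costs nothing to "store".
    # best[j]: cheapest cost of data-sets 1..j given that position j is stored.
    best = [inf] * (n + 2)
    prev = [0] * (n + 2)
    best[0] = 0.0

    for i in range(n + 1):          # i: last stored position before j
        regen = 0.0                 # cost to regenerate the newest deleted data-set
        deleted = 0.0               # cost rate of the deleted run i+1 .. j-1
        for j in range(i + 1, n + 2):
            store = y[j - 1] if j <= n else 0.0
            candidate = best[i] + deleted + store
            if candidate < best[j]:
                best[j] = candidate
                prev[j] = i
            if j <= n:
                # Extend the deleted run to include data-set j.
                regen += x[j - 1]
                deleted += regen * v[j - 1]

    # Walk back from the sink to recover which data-sets are stored.
    stored = []
    k = prev[n + 1]
    while k > 0:
        stored.append(k - 1)
        k = prev[k]
    stored.reverse()
    return best[n + 1], stored
```

The two nested loops give O(n^2) time under this assumed model, because the cost of each deleted run is extended incrementally rather than recomputed. For example, min_cost_storage([10, 10, 10], [1, 100, 1], [1, 1, 1]) stores the first and third data-sets and regenerates the expensive-to-store middle one on demand.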

Full Text