Abstract

Market-based IaaS offers such as Amazon's EC2 Spot Instances are a cost-efficient way to operate a cluster. In contrast to traditional IaaS offers, which follow a fixed pricing scheme, the per-hour price of Spot Instances changes dynamically, and the Spot price is often significantly lower than the price of On-Demand and even Reserved Instances. When deploying a Parallel Data-Processing Engine (PDE) on a cluster of Spot Instances, a major obstacle is finding a bidding strategy that is optimal for a given workload and satisfies user constraints such as a maximum budget. A second obstacle is that existing PDEs implement rigid fault-tolerance schemes that do not adapt to the different failure rates resulting from different bidding strategies. In this paper, we present a novel PDE called Spotgres that tackles these issues. Spotgres extends a typical PDE architecture with (1) a constraint-based bid advisor, which finds an optimal cluster configuration (i.e., a set of bids on Spot Instances), and (2) a cost-based fault-tolerance scheme, which takes various parameters (such as the mean time between failures and query statistics) into account to efficiently execute analytical queries over a set of Spot Instances with varying failure rates.
