Abstract

Self-caring IT systems are those that proactively avoid system failures rather than reactively handling failures after they have occurred. In this paper, we are interested in failures in which a MapReduce job is unable to complete within the time specified by its SLA. The fault-tolerance capability provided by existing MapReduce frameworks is simple, and the penalty associated with handling failures can lead to excessive job execution times. Our goal in this paper is to bring out the severity of this penalty for different job characteristics and configurable framework parameters. We first quantitatively evaluate the execution-time penalty associated with node failures in the open-source MapReduce framework Hadoop, using the MRPerf simulator. This increase in execution time is particularly expensive in pay-as-you-go cloud infrastructures, where users are charged by resource usage duration. Our solution minimizes job-completion-time SLA violations by augmenting the existing fault-tolerance capability of the MapReduce framework with a dynamic resource scaling approach. This approach leverages the elastic properties of a cloud to mitigate execution-time penalties and hence proactively avoids a potential job failure. Using our proposed approach for various job and framework parameters, we show that performance penalties can be decreased by up to 78% in the case of single-node failures and by up to 100% in the case of 4-node failures, at minimal additional cost.
