Abstract
In a cloud-based cyber–physical system, many jobs consist of multiple parallel tasks. Cloud systems usually adopt active task replication to improve the performance and guarantee the reliability of a job: redundant replicas are created for each task and executed concurrently. Each replica is a virtual machine (VM) image that can easily be assigned to different physical machines (PMs) to overcome resource heterogeneity. Designing a rational task replication strategy (covering both replica creation and VM assignment) is nevertheless complex, because it must account for the correlations and tradeoffs among reliability, performance, and energy consumption. This paper first proposes a reliability–performance correlation model for a job executed with active task replication. We design a general method that avoids analyzing complex failure correlations and give a Bayesian approach for calculating the performability metric of the job. The paper also proposes a reliability–energy correlation model that uses mixed random variables to evaluate the random energy consumption of a PM hosting multiple VMs. Finally, an expected net profit optimization model and a genetic algorithm are developed to search for an optimal task replication strategy that balances the tradeoffs among reliability, performance, and energy consumption. Illustrative examples are provided.
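
To make the tradeoff concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes independent replica failures (the paper instead handles correlated failures with a Bayesian approach), a fixed revenue for a successful job, and a fixed energy cost per replica. It replaces the genetic algorithm with a brute-force search over a uniform replica count. All function names and parameter values are hypothetical.

```python
from math import prod


def job_reliability(replica_success_prob, replicas_per_task):
    """Probability that every parallel task has at least one successful replica.

    Assumption (not from the paper): replicas fail independently, each
    succeeding with the same probability `replica_success_prob`.
    """
    return prod(
        1.0 - (1.0 - replica_success_prob) ** r for r in replicas_per_task
    )


def expected_net_profit(replica_success_prob, replicas_per_task,
                        revenue, cost_per_replica):
    """Expected revenue of the job minus a flat (hypothetical) cost per replica."""
    reliability = job_reliability(replica_success_prob, replicas_per_task)
    return revenue * reliability - cost_per_replica * sum(replicas_per_task)


def best_uniform_replication(num_tasks, replica_success_prob,
                             revenue, cost_per_replica, max_replicas=5):
    """Brute-force stand-in for the genetic algorithm: try 1..max_replicas
    replicas for every task and keep the count with the highest profit."""
    return max(
        range(1, max_replicas + 1),
        key=lambda r: expected_net_profit(
            replica_success_prob, [r] * num_tasks, revenue, cost_per_replica
        ),
    )


if __name__ == "__main__":
    # Illustrative numbers only: 3 parallel tasks, 90% replica success rate.
    best = best_uniform_replication(
        num_tasks=3, replica_success_prob=0.9, revenue=100.0, cost_per_replica=2.0
    )
    print("best replicas per task:", best)
```

Under these assumptions, adding replicas raises reliability with diminishing returns while cost grows linearly, so the expected net profit peaks at a finite replica count. This is the kind of balance the optimization model searches for over a much richer strategy space.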