Coded distributed computing (CDC) has recently emerged as a promising solution to the straggling effects in conventional distributed computing systems. By assigning redundant workloads to the computing nodes, CDC can significantly enhance the performance of the whole system. However, since the core idea of CDC is to introduce redundancy to compensate for uncertainty, it may waste a large amount of energy at the edge nodes. The more redundant workload is added, the less impact the straggling effects have on the system, but the more energy is needed to perform the redundant tasks. In this work, we develop a novel framework, namely CERA, to elastically allocate computing resources for CDC processes. CERA consists of two stages. In the first stage, we model a joint coding and node selection optimization problem to minimize the expected processing time of a CDC task. Since the problem is NP-hard, we propose a linearization approach and a hybrid algorithm to quickly obtain the optimal solutions. In the second stage, we develop an online approach based on Lyapunov optimization to dynamically turn off straggling nodes based on their actual performance. As a result, wasteful energy consumption can be significantly reduced with minimal impact on the total processing time. Simulations using real-world datasets show that our proposed approach can reduce the system’s total processing time by more than 200% compared to that of the state-of-the-art approach, even when the nodes’ actual performance is not known in advance. Moreover, the results show that CERA’s online optimization stage can reduce the energy consumption by up to 37.14% without affecting the total processing time.
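The trade-off between redundancy and energy described above can be illustrated with a small simulation. This is only a sketch under stated assumptions and is not the CERA algorithm: it assumes a generic (n, k) code in which the job is split into k subtasks, encoded into n, and finished as soon as any k nodes return. The function name `sweep`, the shifted-exponential node model, and the use of total node busy time as a stand-in for energy are all hypothetical choices made for illustration.

```python
import random


def sweep(k, n_values, trials=1000, seed=0):
    """Estimate mean completion time and mean total busy time for (n, k) coding.

    Assumptions (illustrative, not taken from the paper):
    - each node handles 1/k of the job, so its time is scaled by 1/k;
    - node times are i.i.d. shifted exponentials;
    - no node is stopped early, so busy time is the sum of all node times.
    Returns a dict mapping n -> (mean completion time, mean busy time).
    """
    if any(n < k for n in n_values):
        raise ValueError("every n must be at least k")
    rng = random.Random(seed)
    max_n = max(n_values)
    totals = {n: [0.0, 0.0] for n in n_values}
    for _ in range(trials):
        # Draw one pool of node times per trial and reuse its prefix for each n,
        # so that different n values are compared on the same random nodes.
        pool = [(1.0 + rng.expovariate(1.0)) / k for _ in range(max_n)]
        for n in n_values:
            times = pool[:n]
            # The job completes when the k fastest of the n nodes have finished.
            totals[n][0] += sorted(times)[k - 1]
            # Proxy for energy: every node runs its subtask to the end.
            totals[n][1] += sum(times)
    return {n: (t / trials, b / trials) for n, (t, b) in totals.items()}


if __name__ == "__main__":
    for n, (t, busy) in sweep(k=4, n_values=[4, 6, 8]).items():
        print(f"n={n}: completion time {t:.3f}, total busy time {busy:.3f}")
```

In this toy model, adding nodes beyond k shortens the completion time because fewer slow nodes sit on the critical path, while the total busy time grows with every extra node. Stopping nodes that are no longer needed, which is the role the abstract assigns to the second stage, would reduce that busy time.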