An In-Network Aggregation Scheme for Erasure Coding Storage Systems in Data Centers

Junxu Xia, Chendie Yao, Jiangfan Li

doi:10.1109/cbd.2018.00016

Abstract

In an erasure-coded storage system, when a storage node fails, data blocks must be fetched from the surviving storage nodes to a new node, where the lost data is repaired. This leads to the incast problem at the new node. To date, the incast problem during repair has been addressed through path planning and resource allocation. Although these solutions improve performance to some extent, they remain very resource-consuming. In this work, we show that the incast problem can be resolved economically via in-network aggregation. Specifically, we assume that switches in data centers have some data processing capability and can aggregate data flows. Taking the fat-tree data center as an example, we propose a set of in-network repair methods for erasure coding storage systems. The incast problem caused by repairing data blocks is thus resolved naturally during data transmission. Compared with prior methods, our approach not only avoids the overhead of extra path computation but also significantly reduces the link cost of repairing data blocks, while achieving similar or faster repair speed.
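
The idea of aggregating repair traffic inside the network can be illustrated with a minimal sketch. It assumes a single XOR parity code rather than whatever code the paper actually evaluates, and the block contents, the `racks` layout, and the switch names `edge0` and `edge1` are hypothetical. Because XOR is associative, each switch can combine the blocks passing through it and forward one partial result, so the new node receives a single flow instead of one per helper node.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Assumption: four data blocks protected by one XOR parity block.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
        b"\xaa\xbb\xcc\xdd", b"\x0f\x0e\x0d\x0c"]
parity = xor_blocks(data)
lost = data[0]  # the block held by the failed node

# Hypothetical placement: surviving blocks grouped by their edge switch.
racks = {
    "edge0": [data[1], data[2]],
    "edge1": [data[3], parity],
}

# Baseline: every helper block is sent to the new node, which does all the work.
helpers = [block for blocks in racks.values() for block in blocks]
repaired_direct = xor_blocks(helpers)
flows_direct = len(helpers)  # flows converging on the new node

# In-network aggregation: each edge switch forwards one partial XOR,
# and an upstream switch merges the partials into a single flow.
partials = [xor_blocks(blocks) for blocks in racks.values()]
flows_after_edge = len(partials)
repaired_aggregated = xor_blocks(partials)  # computed at the upstream switch

print(flows_direct, flows_after_edge)
print(repaired_direct == lost, repaired_aggregated == lost)
```

In this toy layout the number of flows drops from one per helper to one per edge switch, and to a single flow once an upstream switch merges the partial results, while the repaired block is unchanged.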

Full Text