In this paper, we qualitatively and quantitatively discuss the design choices, production experience, and lessons in building the Elastic Block Storage (EBS) at Alibaba Cloud over the past decade. To cope with hardware advancement and users’ demands, we shift our focus from design simplicity in EBS1 to high performance and space efficiency in EBS2, and finally to reducing network traffic amplification in EBS3. In addition to the architectural evolutions, we also summarize development lessons and experiences in four topics: (i) achieving high elasticity in latency, throughput, IOPS, and capacity; (ii) improving availability by minimizing the blast radius of individual, regional, and global failure events; (iii) identifying the motivations and key tradeoffs in various hardware offloading solutions; and (iv) identifying the pros and cons of alternative solutions and explaining why seemingly promising ideas would not work in practice.