Abstract

To achieve fault tolerance efficiently in cloud computing, large-scale data centers generally rely on remote backups to improve system reliability. Because these backups travel over long-distance, expensive network links, they incur heavy communication overheads and are exposed to potential errors. To address this problem, we propose Neptune, an efficient remote communication service. Neptune transmits massive data between distant data centers using a cost-effective filtration scheme. Filtration in Neptune means eliminating redundant data and compressing similar files, two techniques that existing work generally studies independently. To bridge this gap, Neptune uses chunk-level deduplication to eliminate duplicate files and approximate delta compression to compress similar files. Moreover, to reduce complexity and overheads, Neptune uses locality-aware hashing to group similar files and proposes shortcut delta chains for fast remote recovery. We have implemented Neptune between two data centers more than 1200 km apart, connected by a 2 Mb/s network link. We evaluate Neptune's performance using real-world traces from Los Alamos National Laboratory (LANL), EMC, and a Campus collection. Experimental results demonstrate the efficiency and efficacy of Neptune compared with state-of-the-art work.
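As a rough illustration of the chunk-level deduplication idea mentioned in the abstract, the following minimal sketch sends only chunks whose fingerprints the remote site does not already hold. It is not Neptune's implementation: the fixed 4096-byte chunk size, the use of SHA-256 fingerprints, and the names `split_chunks`, `filter_for_transfer`, and `rebuild` are all assumptions made for this example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; not taken from the paper


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks (the last chunk may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def filter_for_transfer(data: bytes, remote_index: set[str]):
    """Return (recipe, to_send) for one file.

    recipe  -- ordered list of chunk fingerprints needed to rebuild the file
    to_send -- fingerprint -> chunk, only for chunks missing at the remote site
    """
    recipe: list[str] = []
    to_send: dict[str, bytes] = {}
    for chunk in split_chunks(data):
        fp = hashlib.sha256(chunk).hexdigest()  # hypothetical fingerprint choice
        recipe.append(fp)
        if fp not in remote_index and fp not in to_send:
            to_send[fp] = chunk
    return recipe, to_send


def rebuild(recipe: list[str], store: dict[str, bytes]) -> bytes:
    """Reassemble a file at the remote site from its recipe and chunk store."""
    return b"".join(store[fp] for fp in recipe)


# Usage: the second file shares its first chunk with the first file,
# so only one new chunk has to cross the network.
remote_store: dict[str, bytes] = {}

file_a = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE
recipe_a, send_a = filter_for_transfer(file_a, set(remote_store))
remote_store.update(send_a)

file_b = b"A" * CHUNK_SIZE + b"C" * CHUNK_SIZE
recipe_b, send_b = filter_for_transfer(file_b, set(remote_store))
remote_store.update(send_b)
```

In this sketch, deduplication removes exact duplicate chunks only; compressing files that are similar but not identical would need a separate delta-encoding step, which is omitted here.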
