Abstract

Large-scale high-performance computing (HPC) simulations generate extremely large amounts of scientific data. Storing and transmitting such volumes poses serious scalability and performance issues, which data compression can considerably mitigate by reducing storage size and data movement. Because scientists share data more and more frequently, security methods that ensure the confidentiality, integrity, and availability of the data are becoming increasingly important. Combining compression and encryption is therefore critical to storing large-scale datasets securely. In this work, we explore how to integrate data compression and cryptography techniques as efficiently as possible for big scientific datasets in the HPC field. We perform thorough experiments on different scientific datasets with the state-of-the-art error-bounded lossy compressor SZ in a real-world supercomputing environment. The experiments verify that performing encryption before lossy compression (the encr-cmpr method) may invalidate the advantage of compression algorithms. By contrast, executing encryption after lossy compression (the cmpr-encr method) preserves not only high compression ratios but also high overall execution speed. The experiments also verify that the encryption overhead under the cmpr-encr method decreases as the compression ratio increases, which indicates very good scalability.
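
The effect of operation order can be illustrated with a minimal sketch. It is not the experimental setup of this work: lossless `zlib` stands in for the SZ lossy compressor, a toy SHA-256 keystream cipher (not secure, for illustration only) stands in for a real cipher, and the sine-wave "field", the key, and all function names are hypothetical.

```python
import hashlib
import math
import struct
import zlib


def toy_stream_cipher(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Toy stand-in for a real cipher; NOT secure. Applying it twice
    with the same key restores the original bytes.
    """
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        block = hashlib.sha256(key + struct.pack("<Q", counter)).digest()
        keystream.extend(block)
        counter += 1
    return bytes(b ^ k for b, k in zip(data, keystream))


# Hypothetical smooth "simulation field": quantized sine samples as int32.
samples = [int(1000 * math.sin(i / 500.0)) for i in range(50_000)]
raw = struct.pack(f"<{len(samples)}i", *samples)
key = b"hypothetical-key"

# cmpr-encr: compress first, then encrypt the (smaller) compressed stream.
cmpr_encr = toy_stream_cipher(zlib.compress(raw), key)

# encr-cmpr: encrypt first, then try to compress the ciphertext.
encr_cmpr = zlib.compress(toy_stream_cipher(raw, key))

ratio_cmpr_encr = len(raw) / len(cmpr_encr)
ratio_encr_cmpr = len(raw) / len(encr_cmpr)

print(f"cmpr-encr compression ratio: {ratio_cmpr_encr:.2f}")
print(f"encr-cmpr compression ratio: {ratio_encr_cmpr:.2f}")
```

In this sketch the ciphertext looks random to the compressor, so the encr-cmpr ratio stays near 1, while cmpr-encr keeps the compressor's ratio and encrypts fewer bytes. That is consistent with the observation that encryption overhead shrinks as the compression ratio grows.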

