Abstract

Every HPC system today must cope with a deluge of data generated by scientific applications, simulations, and large-scale experiments. The upscaling of supercomputer systems and infrastructures generally results in a dramatic increase in their energy consumption. In this paper, we argue that techniques such as data compression can yield significant gains in power efficiency by reducing both network and storage requirements. However, the effectiveness of any data reduction is highly data-specific and must comply with established requirements; an unsuitable or inappropriate compression strategy can consume more resources and energy than it saves. To address this, we propose a novel methodology for on-the-fly, intelligent selection of energy-efficient data reduction for a given data set, leveraging state-of-the-art compression algorithms and metadata at the application I/O level. We motivate our work by analyzing the energy and storage saving needs of data sets from real-world scientific HPC applications, and we review the various lossless compression techniques that can be applied. We find that the resulting data reduction can decrease the data volume transferred and stored by as much as 80% in some cases, leading to significant savings in storage and networking costs.
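The abstract does not specify how the on-the-fly selection works, so the following is only a minimal illustrative sketch, not the authors' method: each candidate lossless codec from the Python standard library is benchmarked on a sample of the data, and a crude cost model trades compression ratio against throughput (a rough proxy for CPU energy). The codec set, the `pick_codec` helper, and the cost weighting are all hypothetical assumptions introduced here for illustration.

```python
import bz2
import lzma
import time
import zlib

# Candidate lossless compressors from the standard library; a real HPC
# deployment would likely add faster codecs such as LZ4 or zstd.
CODECS = {
    "zlib": lambda buf: zlib.compress(buf, 6),
    "bz2":  lambda buf: bz2.compress(buf, 9),
    "lzma": lambda buf: lzma.compress(buf, preset=1),
}

def pick_codec(data: bytes, sample_size: int = 1 << 20):
    """Benchmark each codec on a sample; return (name, ratio, MB/s)."""
    sample = data[:sample_size]
    best = None
    for name, compress in CODECS.items():
        t0 = time.perf_counter()
        out = compress(sample)
        dt = time.perf_counter() - t0
        ratio = len(out) / len(sample)   # smaller is better
        mbps = len(sample) / dt / 1e6    # throughput proxy for CPU energy
        # Hypothetical cost model: bytes shipped to network/storage plus a
        # CPU-time penalty; the weighting would be calibrated per system.
        cost = ratio + 0.01 / max(mbps, 1e-9)
        if best is None or cost < best[0]:
            best = (cost, name, ratio, mbps)
    return best[1:]

if __name__ == "__main__":
    payload = b"temperature=293.15,pressure=101.3;" * 50_000
    name, ratio, mbps = pick_codec(payload)
    print(f"chosen={name} ratio={ratio:.3f} throughput={mbps:.1f} MB/s")
```

In practice, such a selector would also consult application-level metadata (data type, dimensionality, expected access pattern) rather than sampling alone, which is the direction the abstract suggests.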
