State of the Art and Future Trends in Data Reduction for High-Performance Computing

Kira Duwe, Jakob Lüttgau, Georgiana Mania, Jannek Squar, Anna Fuchs, Michael Kuhn, Eugen Betke, Thomas Ludwig

doi:10.14529/jsfi200101

Open Access

https://doi.org/10.14529/jsfi200101

Abstract

Research into data reduction techniques has gained popularity in recent years as storage capacity and performance have become growing concerns. This survey paper provides an overview of leverage points found in high-performance computing (HPC) systems and suitable mechanisms to reduce data volumes. We present the underlying theories and their application throughout the HPC stack and also discuss related hardware acceleration and reduction approaches. After introducing relevant use cases, we give an overview of modern lossless and lossy compression algorithms and their respective usage at the application and file system layers. In anticipation of their increasing relevance for adaptive and in situ approaches, dimensionality reduction techniques are summarized with a focus on non-linear feature extraction. Adaptive approaches and in situ compression algorithms and frameworks follow. The key stages of deduplication and new opportunities for it are covered next. Finally, we discuss recomputation, an unconventional but promising method. We conclude the survey with an outlook on future developments.

Full Text