Extreme Compression for Large Scale Data Store

Jérôme Lauret, Gene Van Buren, Philippe Canal, Juan Gonzalez, Axel Naumann, Rafael Nuñez

doi:10.1051/epjconf/202024506024


Open Access

https://doi.org/10.1051/epjconf/202024506024


Abstract

Over the last five years, Accelogic has pioneered and refined a new theory of numerical computing codenamed “Compressive Computing”, which has a profound impact on real-world computer science [1]. At the core of this theory is one of its fundamental theorems, which states that, under very general conditions, the vast majority (typically between 70% and 80%) of the bits used in modern large-scale numerical computations are irrelevant to the accuracy of the end result. Compressive Computing provides mechanisms that identify the number of bits (i.e., the precision) needed to represent numbers without affecting the substance of the end results, as they are computed and vary in real time. The intended outcome is a state-of-the-art compression algorithm that surpasses those currently available in the ROOT framework, enabling substantial economic and operational gains (including speedup) for High Energy and Nuclear Physics (HENP) data storage and analysis. In our initial studies, a compression factor of nearly 4 (3.9) was achieved with RHIC/STAR data, where ROOT compression managed only 1.4. In this contribution, we present our concept of “functionally lossless compression”, look at examples and achievements in other communities, report the results of our ongoing R&D, and give a high-level view of our plan to move forward with a ROOT implementation that would deliver a basic solution readily integrated into HENP applications.
As a collaboration of experimental scientists, private industry, and the ROOT Team, we aim to build on the success of the initial effort and produce a robust technology, packaged as an open-source tool, that virtually every experiment around the world could use to improve data management and accessibility.
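
The idea in the abstract, that many stored bits do not affect the end result, can be illustrated with a small sketch. It is not Accelogic's algorithm, which the abstract does not describe. It assumes a fixed, hand-picked precision (8 mantissa bits), synthetic Gaussian values in place of physics data, and zlib as a stand-in general-purpose compressor. The function names `truncate_mantissa` and `compression_ratio` are hypothetical.

```python
import random
import struct
import zlib


def truncate_mantissa(value: float, keep_bits: int) -> float:
    """Zero the low-order mantissa bits of a float32, keeping `keep_bits` of 23.

    Illustrative only: a fixed precision is assumed here, whereas the abstract
    describes choosing the precision adaptively.
    """
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    drop = 23 - keep_bits
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    (out,) = struct.unpack("<f", struct.pack("<I", raw & mask))
    return out


def compression_ratio(values) -> float:
    """Ratio of the raw float32 size to the zlib-compressed size."""
    payload = struct.pack(f"<{len(values)}f", *values)
    return len(payload) / len(zlib.compress(payload, 9))


# Synthetic stand-in data (an assumption; not RHIC/STAR data).
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]
reduced = [truncate_mantissa(v, 8) for v in samples]

ratio_full = compression_ratio(samples)
ratio_reduced = compression_ratio(reduced)
max_rel_err = max(
    abs(a - b) / abs(a) for a, b in zip(samples, reduced) if a != 0.0
)

print(f"full precision ratio:    {ratio_full:.2f}")
print(f"reduced precision ratio: {ratio_reduced:.2f}")
print(f"max relative error:      {max_rel_err:.5f}")
```

Zeroing the low mantissa bits leaves long runs of identical bits that a general-purpose compressor can exploit, at the cost of a bounded relative error per value. Whether that error is acceptable depends on the analysis, which is the question "functionally lossless" compression has to answer.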

Highlights

Analyses that use these data thrive on rapid (or "live") access not only to data under current production but also to prior years’ accumulated data.
Live storage and networking infrastructure performant enough to meet those demands is costly, which justifies investigating alternatives to simply spending more on infrastructure.
Examples include administrative procedures such as limiting the portions of datasets of interest that are made accessible on live storage and limiting the duration of their accessibility.

Summary

Introduction

Analyses that use these data thrive on rapid (or "live") access not only to data under current production but also to prior years’ accumulated data. Live storage and networking infrastructure performant enough to meet those demands is costly, which justifies investigating alternatives to simply spending more on infrastructure. Examples include administrative procedures such as limiting the portions of datasets of interest that are made accessible on live storage and limiting the duration of their accessibility. None of these administrative solutions is ideal, as they inherently restrict access in one way or another, and waiting for access is neither time nor money well spent.

Journal: EPJ Web of Conferences
Publication Date: Jan 1, 2020
Citations: 2
License type: CC BY 4.0
