Abstract

The widening performance gap between computation and I/O creates major data management challenges for simulation-based scientific discovery. Data reduction is among the most promising techniques for bridging this gap, because it reduces the amount of data migrated to persistent storage. However, current reduction performance still falls far short of what production applications demand. To this end, we propose a new methodology that aggressively reduces data despite a substantial loss of information, and re-computes the data at its original accuracy on demand. As a result, our scheme creates the illusion of a fast, large storage medium on which high-accuracy data is available. We further design a load-aware data reduction strategy that monitors the I/O overhead at runtime and dynamically adjusts the reduction ratio. We verify the efficacy of our methodology on adaptive mesh refinement (AMR), a popular numerical technique for solving partial differential equations. We evaluate data reduction and selective data re-computation on Titan, using a real application in FLASH and mini-applications in Chombo. To demonstrate the benefits of re-computation, we compare it with other state-of-the-art data reduction methods, including SZ, ZFP, FPC, and deduplication. Re-computation is shown to be superior in reduction ratio and in both write and read speeds, the latter particularly when only a small amount of data (e.g., 1%) needs to be retrieved. Our results confirm that data reduction and selective data re-computation can 1) reduce the performance gap between I/O and compute by aggressively reducing AMR levels and, more importantly, 2) efficiently recover the target accuracy for AMR through re-computation.
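The load-aware strategy described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `adjust_levels_kept`, the overhead budget of 10%, and the `slack` factor are all hypothetical choices made for illustration. The sketch assumes that reduction is expressed as the number of AMR levels written to storage, and that the I/O overhead is measured as the fraction of wall-clock time spent in I/O during the previous output step.

```python
def adjust_levels_kept(levels_kept, io_time, compute_time, max_levels,
                       budget=0.10, slack=0.5):
    """Return how many AMR levels to write at the next output step.

    Hypothetical controller: `budget` is the tolerated I/O fraction of
    total time, and `slack` sets how far below the budget the overhead
    must fall before a level is restored (to avoid oscillation).
    """
    total = io_time + compute_time
    if total <= 0:
        return levels_kept  # nothing measured yet, keep current setting

    overhead = io_time / total  # fraction of time spent in I/O

    if overhead > budget and levels_kept > 1:
        # I/O is too costly: drop the finest level that is still written.
        return levels_kept - 1
    if overhead < budget * slack and levels_kept < max_levels:
        # I/O is cheap: write one more (finer) level.
        return levels_kept + 1
    return levels_kept


if __name__ == "__main__":
    # Heavy I/O (30% of the step): reduce from 4 written levels to 3.
    print(adjust_levels_kept(4, io_time=3.0, compute_time=7.0, max_levels=5))
```

Levels that are dropped under this scheme would later be regenerated by re-computation when a reader requests the original accuracy; that step depends on the application's solver and is not sketched here.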
