Managing Rich Metadata in High-Performance Computing Systems Using a Graph Model

Dong Dai, Robert Ross, Wei Zhang, Philip Carns, John Jenkins, Yong Chen

doi:10.1109/tpds.2018.2887380

Abstract

High-performance computing (HPC) systems generate huge amounts of metadata about different entities such as jobs, users, and files. Existing systems can efficiently record and manage only part of these metadata, mainly the POSIX metadata of data files (e.g., file size, name, and permission mode). Another important set of metadata, referred to as “rich” metadata in this study, which record not only a wider range of entities (e.g., running processes and jobs) but also more complex relationships among them, are mostly missing in current HPC systems. Yet such rich metadata are critical for supporting many advanced data management functions, such as identifying the data sources and parameters behind a given result, auditing data usage, and understanding how inputs are transformed into outputs. To manage the rich metadata generated in HPC systems uniformly and efficiently, we propose to use a graph model in this study. We identify the key challenges of implementing such a graph-based HPC rich metadata management system and present GraphMeta, a graph-based rich metadata management system designed and optimized for HPC platforms, to tackle these challenges. Extensive evaluations on both synthetic and real HPC metadata workloads show its advantages in both performance and scalability compared with existing solutions.
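
As a rough illustration of the graph model described in the abstract, the sketch below stores entities (users, jobs, files) as vertices and their relationships as labeled edges, then walks the edges backwards to find everything behind a given result. It is a minimal in-memory sketch and not GraphMeta's design or API: the class `MetadataGraph`, its methods, the edge labels, and the sample identifiers are all hypothetical names chosen for this example.

```python
from collections import defaultdict, deque


class MetadataGraph:
    """Toy property graph: typed vertices plus labeled, directed edges.

    Hypothetical illustration only; not the GraphMeta implementation.
    """

    def __init__(self):
        self.vertices = {}                  # vertex id -> property dict
        self.in_edges = defaultdict(list)   # dst id -> [(src id, label)]

    def add_vertex(self, vid, vtype, **props):
        # Store the entity type alongside any free-form properties.
        self.vertices[vid] = {"type": vtype, **props}

    def add_edge(self, src, label, dst):
        # Edges point in the direction data or control flows.
        self.in_edges[dst].append((src, label))

    def lineage(self, vid):
        """Return the ids of all vertices that can reach `vid`."""
        seen = set()
        queue = deque([vid])
        while queue:
            current = queue.popleft()
            for src, _label in self.in_edges.get(current, []):
                if src not in seen:
                    seen.add(src)
                    queue.append(src)
        return seen


# Assumed sample data: one user runs one job that reads and writes a file.
g = MetadataGraph()
g.add_vertex("user:alice", "user")
g.add_vertex("job:42", "job", params={"dt": 0.01})
g.add_vertex("file:/in.dat", "file", size=1024)
g.add_vertex("file:/out.dat", "file", size=2048)
g.add_edge("user:alice", "ran", "job:42")
g.add_edge("file:/in.dat", "read_by", "job:42")
g.add_edge("job:42", "wrote", "file:/out.dat")

# Everything that contributed to the output file.
ancestors = g.lineage("file:/out.dat")
# Job parameters found along the way.
job_params = {v: g.vertices[v]["params"]
              for v in ancestors if g.vertices[v]["type"] == "job"}
```

Here the backward traversal answers the "data sources and parameters behind a given result" question in one pass; the same structure could in principle support auditing by traversing forward from a user or file instead.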

Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: Jul 1, 2019
Citations: 37
License type: publisher-specific, author manuscript

Similar Papers

Design of robust scheduling methodologies for high performance computing
01 Jan 2019

GraphMeta: A Graph-Based Engine for Managing Large-Scale HPC Rich Metadata
Dong Dai ... Wei Zhang
01 Sep 2016

Anomaly Detection and Anticipation in High Performance Computing Systems
Andrea Borghesi ... Michela Milano
IEEE Transactions on Parallel and Distributed Systems | VOL. 33
22 May 2021

FT-PBLAS: PBLAS-Based Fault-Tolerant Linear Algebra Computation on High-performance Computing Systems
Yanchao Zhu ... Yi Liu
IEEE Access | VOL. 8
01 Jan 2020
