Abstract

GlideinWMS is a pilot framework to provide uniform and reliable HTCondor clusters using heterogeneous resources. The Glideins are pilot jobs that are sent to the selected nodes, test them, set them up as desired by the user jobs, and ultimately start an HTCondor startd to join an elastic pool. These Glideins collect information that is useful for evaluating the health and efficiency of the worker nodes and invaluable for troubleshooting when something goes wrong. This data, including local stats, the results of all the tests, and the HTCondor log files, is packed and sent to the GlideinWMS Factory. To access this information, developers and troubleshooters must exchange emails with Factory operators and dig manually into files. Furthermore, these files also contain information, such as email addresses, IP addresses, and user IDs, that we want to protect and limit access to. GlideinMonitor is a Web application that makes these logs more accessible and useful: it organizes the logs in an efficient compressed archive; it allows users to search, unpack, and inspect them, all in a convenient and secure Web interface; and, via plugins like the log anonymizer, it can redact protected information while preserving the parts useful for troubleshooting.

Highlights

  • The primary objective of this paper is to describe the GlideinMonitor system and to show its utility in a Glidein-based distributed High Throughput Computing system

  • We first provide some background information about the GlideinWMS [1] system and the information it collects; secondly, we describe GlideinMonitor [2] and explain how it simplifies the activities of software developers and GlideinWMS operators; and we show how the system can comply with restrictive privacy policies and still allow the work of troubleshooters and developers

  • GlideinMonitor is a very useful tool to archive and share Glidein logs and, with its anonymization plug-in, it makes it possible to follow the guidelines of Privacy-Preserving Data Publishing and comply with regulations like GDPR


Summary

Introduction

The primary objective of this paper is to describe the GlideinMonitor system and to show its utility in a Glidein-based distributed High Throughput Computing (dHTC) system. GlideinWMS is a Glidein-based Workload Management System leveraging HTCondor. The Frontend monitors the user requests, selects the best resources to provide the virtual cluster for the users, and requests the Factory to submit Glideins to those resources. A recent extension allows Glideins to upload log files to monitoring servers: Web servers, designated by the Factory or Frontend, that accept authenticated uploads. These uploads can happen during the execution of the Glidein as well as at the end, allowing more timely incremental updates. The Glidein monitoring infrastructure, GlideinMonitor, and the anonymization filter aim to eliminate the obstacles to accessing and sharing these logs.
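
To make the idea of log anonymization concrete, the following is a minimal sketch of redacting email and IPv4 addresses from a log line while leaving the rest intact. It is not the GlideinMonitor anonymizer plugin: the function name `redact_line`, the regular expressions, and the placeholder tokens are all assumptions made for illustration, and the sample log line is invented.

```python
import re

# Hypothetical patterns for illustration only; the actual anonymizer
# plugin may detect and replace protected data differently.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_line(line: str) -> str:
    """Replace email and IPv4 addresses with fixed placeholder tokens."""
    line = EMAIL_RE.sub("<EMAIL>", line)
    line = IPV4_RE.sub("<IP>", line)
    return line


if __name__ == "__main__":
    # Invented log line, not taken from a real Glidein log.
    sample = "user jdoe@example.org connected from 192.168.1.10 at 12:00"
    print(redact_line(sample))
```

Fixed placeholders keep timestamps, messages, and overall line structure readable for troubleshooting; a scheme that maps each distinct value to a stable pseudonym would additionally preserve correlations across lines, at the cost of keeping a mapping table.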

GlideinMonitor
Indexer
Web server
Log Anonymization
Research
Implementation
Conclusion
