Abstract
The Open Computing Cluster for Advanced data Manipulation (OCCAM) is a multipurpose, flexible HPC cluster designed and operated by a collaboration between the University of Torino and the Sezione di Torino of the Istituto Nazionale di Fisica Nucleare. It aims to provide a flexible, reconfigurable and extendable infrastructure catering to a wide range of scientific computing use cases, including solid-state chemistry, high-energy physics, computer science, big data analytics, computational biology, genomics and many others. Furthermore, it will serve as a platform for R&D activities on computational technologies themselves, with topics ranging from GPU acceleration to Cloud Computing technologies. A heterogeneous and reconfigurable system like this poses a number of challenges related to the frequency at which heterogeneous hardware resources might change their availability and shareability status, which in turn affects the methods and means to allocate, manage, optimize, bill and monitor VMs, containers, virtual farms, jobs, interactive bare-metal sessions, etc. This work describes some of the use cases that prompted the design and construction of the HPC cluster, its architecture and resource provisioning model, along with a first characterization of its performance using synthetic benchmark tools and a few realistic use-case tests.
Highlights
Because of the very diverse range of use cases and applications, benchmarking the resources poses a challenge in the selection of the tools to be used, and even in the choice of the metrics to be measured
A team from the Chemistry Department of the University of Torino is developing and maintaining CRYSTAL [2], a widely used program for ab initio solid-state chemistry
The R-based code is distributed as a set of Docker containers that run in sequence, each using the output of the previous one
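A pipeline of this shape can be driven by a small wrapper that runs each container in order against a shared volume, so every stage finds its predecessor's output in the same directory. The sketch below is an illustration only: the image names, the `/data` mount point and the `STAGE_INDEX` variable are hypothetical, not taken from the paper.

```python
# Sketch of a sequential container pipeline: each stage reads its
# predecessor's output from a shared host directory mounted into the
# container. Image names and mount points are hypothetical examples.

def pipeline_commands(stages, shared_dir="/scratch/pipeline"):
    """Build the `docker run` command line for each stage, in order."""
    commands = []
    for i, image in enumerate(stages):
        commands.append([
            "docker", "run", "--rm",
            "-v", f"{shared_dir}:/data",   # same volume for every stage
            "-e", f"STAGE_INDEX={i}",      # lets the code locate its input
            image,
        ])
    return commands

# Hypothetical stage images for an R-based analysis chain:
cmds = pipeline_commands(["r-stage-qc", "r-stage-align", "r-stage-report"])
for c in cmds:
    print(" ".join(c))
```

A real driver would execute each command (e.g. via `subprocess.run`) and stop on the first non-zero exit code, since a later stage cannot run without its predecessor's output.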
Summary
3. Benchmarking the resources
Because of the very diverse range of use cases and applications, benchmarking the resources poses a challenge in the selection of the tools to be used, and even in the choice of the metrics to be measured.
Filesystem performance
Because of their different uses, the two storage subsystems were tested against different metrics: the Scratch system was tested for random I/O access and metadata-handling performance, while the Archive was tested for sequential I/O only. In the I/O test, the performance of both storage subsystems was measured at the same time, to approximate a realistic working condition
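The two kinds of measurements mentioned above can be illustrated with a minimal sketch, assuming a POSIX filesystem: a single large sequential write (Archive-style) and a tight loop of create/stat/unlink operations on small files (Scratch-style metadata handling). This is not the benchmark suite used in the paper; production tests would use dedicated tools and far larger working sets, and all sizes and counts below are illustrative.

```python
# Illustrative micro-benchmark: sequential throughput vs. metadata rate.
# Sizes and file counts are deliberately tiny; they are examples only.
import os
import time
import tempfile

def sequential_write_mb_s(path, size_mb=16, block_kb=1024):
    """Archive-style test: time one large sequential write, fsync'd."""
    block = b"\0" * (block_kb * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mb * 1024 // block_kb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())   # include the flush to stable storage
    return size_mb / (time.perf_counter() - start)

def metadata_ops_s(directory, n_files=500):
    """Scratch-style test: rate of create/stat/unlink on small files."""
    start = time.perf_counter()
    for i in range(n_files):
        p = os.path.join(directory, f"f{i}")
        open(p, "w").close()   # create
        os.stat(p)             # stat
        os.unlink(p)           # delete
    return 3 * n_files / (time.perf_counter() - start)

with tempfile.TemporaryDirectory() as d:
    print(f"sequential write: {sequential_write_mb_s(os.path.join(d, 'big')):.1f} MB/s")
    print(f"metadata ops:     {metadata_ops_s(d):.0f} ops/s")
```

Running both workloads concurrently against the two storage subsystems, as the paper describes, would additionally expose contention effects that these isolated measurements cannot show.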