Abstract

This chapter considers the ubiquitous position of commodity clusters in providing high‐performance computing (HPC) capabilities to the scientific community, and the many issues organizations face when deciding how best to procure, maintain, and maximize the usage of such a resource. With a focus on developments within the UK, we provide an overview of the current HPC landscape from both the customer and supplier perspectives. Historically, HPC provision in the UK has focused on one or two leading‐edge national facilities that have helped the UK to develop and maintain an internationally competitive position in research using HPC. This dynamic changed dramatically in the period 2005–2008, however, with the major injection of funding into the University HPC sector through the Science Research Investment Fund (SRIF). This sector is now the major provider of HPC resources to the UK research base, its capability having increased 100‐fold. Our primary focus is the role of HPC Integrators in supplying resources to this sector, and the challenges faced by the HPC service providers themselves in sustaining and growing these activities. Host sites, in partnership with their selected integrator, aim to get the most from the entire process, from procurement through system installation to the subsequent lifetime of the service. We ask whether integrators based primarily in the UK can provide the level of expertise required in all phases of that process, including ongoing support of the resource throughout its lifecycle. We also consider how current HPC technology roadmaps might affect the role of integrators in responding to the challenges that lie ahead.
Crucial issues when considering potential integrator involvement include both the size of the hardware solution (that is, the number of cores and nodes) and the ongoing robustness of the open‐source software solutions that might be deployed on these platforms. Informed by developments over the past 24 months associated with the deployment of systems funded under SRIF, we provide an in‐depth analysis of the current status and capability of a number of the leading HPC Integrators within the UK. Our primary attention is given to the three major companies that now supply the academic community and hence are well known to us: Streamline Computing, ClusterVision, and OCF. Seven other integrators are also considered, albeit with less rigor, as are several of the Tier‐1 suppliers of clusters. In reviewing the status of commodity‐based systems in scientific and technical computing, which are representative of the systems supported by the integrators, we consider how an organization might best decide on the optimum technology to deploy against its intended workload. We outline our cluster performance evaluation work, which uses a variety of synthetic and application‐based floating‐point metrics to inform this question. Our analysis relies on performance measurements of application‐independent tests (microbenchmarks) and a suite of scientific applications that are in active use on many large‐scale systems. The microbenchmarks provide information on the performance characteristics of the hardware, specifically memory bandwidth and latency, and intercore/interprocessor communication performance. These measurements have been used extensively to provide insight into application performance. The scientific applications are taken from existing workloads within the SRIF HPC sector and represent various scientific domains and program structures, including molecular dynamics, computational engineering, and materials simulation.
The chapter concludes with a 10‐point summary of important considerations when procuring HPC clusters, particularly those in the mid‐to‐high‐end range.

