Abstract

The advent of computing resources with co-processors, for example Graphics Processing Units (GPU) or Field-Programmable Gate Arrays (FPGA), for use cases like the CMS High-Level Trigger (HLT) or data processing at leadership-class supercomputers poses challenges for the current data processing frameworks. These challenges include developing a model for algorithms to offload their computations to the co-processors as well as keeping the traditional CPU busy with other work. The CMS data processing framework, CMSSW, implements multithreading using the Intel Threading Building Blocks (TBB) library, which utilizes tasks as units of concurrent work. In this paper we will discuss a generic mechanism, implemented in CMSSW, to interact effectively with non-CPU resources. In addition, configuring such a heterogeneous system is challenging. In CMSSW an application is configured with a configuration file written in the Python language, and the algorithm types are part of the configuration. The challenge therefore is to unify the CPU and co-processor settings while allowing their implementations to remain separate. We will explain how we solved these challenges while minimizing the necessary changes to the CMSSW framework. We will also discuss, using a concrete example, how algorithms can offload work to NVIDIA GPUs directly through the CUDA API.
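
To make the direct CUDA offload concrete, the following minimal sketch (hypothetical code, not taken from CMSSW or the paper) shows the pattern an algorithm could follow: host data is copied to the device, a kernel runs, and the result is copied back, all queued on a CUDA stream. The kernel name scale, the buffer size, and the use of pinned host memory are illustrative assumptions.

// Hypothetical sketch: offloading a simple computation to an NVIDIA GPU
// directly through the CUDA runtime API.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float* data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= factor;
}

int main() {
  const int n = 1 << 20;

  float* host = nullptr;
  float* device = nullptr;
  cudaStream_t stream;
  cudaStreamCreate(&stream);
  cudaMallocHost(&host, n * sizeof(float));  // pinned host memory for asynchronous copies
  cudaMalloc(&device, n * sizeof(float));
  for (int i = 0; i < n; ++i) host[i] = 1.f;

  // The transfers and the kernel are queued on the stream; the calls return immediately.
  cudaMemcpyAsync(device, host, n * sizeof(float), cudaMemcpyHostToDevice, stream);
  scale<<<(n + 255) / 256, 256, 0, stream>>>(device, 2.f, n);
  cudaMemcpyAsync(host, device, n * sizeof(float), cudaMemcpyDeviceToHost, stream);

  // A standalone program can simply synchronize; inside a task-based framework
  // the completion would instead be signalled back to the scheduler.
  cudaStreamSynchronize(stream);
  std::printf("host[0] = %f\n", host[0]);

  cudaFree(device);
  cudaFreeHost(host);
  cudaStreamDestroy(stream);
  return 0;
}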

Highlights

  • Co-processors or computing accelerators like graphics processing units (GPU) or field-programmable gate arrays (FPGA) are becoming more and more popular to keep the cost and power consumption of computing centers under control

  • In this paper we describe generic mechanisms to interact with non-CPU resources effectively from the Threading Building Blocks (TBB) tasks (Section 2), and to configure CPU and non-CPU algorithms in a unified way (Section 3)

  • As a first step to gain experience, we have explored various ways in which algorithms could offload work to NVIDIA GPUs with CUDA [17]

Summary

Introduction

Co-processors or computing accelerators like graphics processing units (GPU) or field-programmable gate arrays (FPGA) are becoming more and more popular to keep the cost and power consumption of computing centers under control. The CMS data processing framework (CMSSW) [11,12,13,14,15] implements multi-threading using the Intel Threading Building Blocks (TBB) [16] library, utilizing tasks as units of concurrent work. While in principle TBB tasks could interact with non-CPU resources directly in a straightforward way, the non-CPU APIs typically imply blocking the calling thread. Such blocking would lead to under-utilizing the CPU. In this paper we describe generic mechanisms to interact with non-CPU resources effectively from the TBB tasks (Section 2), and to configure CPU and non-CPU algorithms in a unified way (Section 3).
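
The non-blocking interaction can be sketched as follows (a hypothetical example assuming oneTBB and CUDA 10 or later, not CMSSW code): a TBB task queues the GPU work together with a host callback on a CUDA stream and returns immediately; when the device finishes, the callback hands the continuation back to TBB, so no CPU thread ever waits on the device. The kernel fill, the Chain helper struct, and the std::promise that keeps the demo's main() alive are illustrative assumptions.

// Hypothetical sketch: a TBB task offloads to the GPU without blocking.
#include <cuda_runtime.h>
#include <tbb/task_arena.h>
#include <future>
#include <cstdio>

__global__ void fill(float* out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = static_cast<float>(i);
}

struct Chain {
  tbb::task_arena* arena;
  std::promise<void>* done;
  float* device_out;
  int n;
};

// Runs on a CUDA-managed thread once the work queued before it has finished.
void CUDART_CB onGpuDone(void* arg) {
  Chain* c = static_cast<Chain*>(arg);
  // Hand the continuation back to TBB instead of doing real work here.
  c->arena->enqueue([c] {
    std::printf("GPU produced %d values; CPU continuation running\n", c->n);
    cudaFree(c->device_out);
    c->done->set_value();
  });
}

int main() {
  tbb::task_arena arena;
  std::promise<void> done;
  Chain chain{&arena, &done, nullptr, 1 << 16};

  // First TBB task: queue the GPU work asynchronously and return at once,
  // leaving the thread free for other work.
  arena.enqueue([&chain] {
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaMalloc(&chain.device_out, chain.n * sizeof(float));
    fill<<<(chain.n + 255) / 256, 256, 0, stream>>>(chain.device_out, chain.n);
    cudaLaunchHostFunc(stream, onGpuDone, &chain);
    cudaStreamDestroy(stream);  // resources are released once the queued work completes
  });

  done.get_future().wait();  // demo only: keep the process alive until the chain finishes
  return 0;
}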

Concurrent CPU and non-CPU processing
Unified configuration for CPU and non-CPU algorithms
Pattern to interact with CUDA runtime
Asynchronous execution
Sharing of resources between modules
Minimization of data movements
Summary