Abstract

The OpenMP standard is the primary mechanism used at high performance computing facilities to allow intra-process parallelization. In contrast, many HEP-specific software packages (such as CMSSW, GaudiHive, and ROOT) make use of Intel’s Threading Building Blocks (TBB) library to accomplish the same goal. In these proceedings we will discuss our work to compare TBB and OpenMP when used for scheduling algorithms to be run by a HEP-style data processing framework. This includes both the scheduling of different interdependent algorithms to be run concurrently and the scheduling of concurrent work within one algorithm. As part of the discussion we present an overview of the OpenMP threading model. We also explain how we used OpenMP when creating a simplified HEP-like processing framework. Using that simplified framework, and a similar one written using TBB, we will present performance comparisons between TBB and different compiler versions of OpenMP.

Highlights

  • The CMS experiment at the LHC has used a multi-thread enabled data processing framework, CMSSW [1], for large scale data processing since the start of LHC Run 2 in 2016

  • We have found that when we communicate with High Performance Computing (HPC) specialists, they often ask why we are not using OpenMP for concurrency

  • The #pragma omp parallel statement starts threads that are used to process the C++ block directly following the statement. Those threads can only be used by that parallel construct. (This is relevant for the case of nested parallel blocks discussed in subsection 2.3.) The thread that first encountered the pragma statement, which OpenMP refers to as the master thread, joins in processing the block (a minimal sketch follows this list)
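
A minimal, self-contained sketch of the parallel construct described in the last highlight is shown below. It is our own illustration, not code from the paper; the printed message and file name are arbitrary.

    // Compile with OpenMP support, e.g. g++ -fopenmp parallel_example.cpp
    #include <cstdio>
    #include <omp.h>

    int main() {
      // The pragma starts a team of threads; every thread in the team,
      // including the master thread that encountered the pragma,
      // executes the block that directly follows.
      #pragma omp parallel
      {
        std::printf("hello from thread %d of %d\n",
                    omp_get_thread_num(), omp_get_num_threads());
      }
      return 0;
    }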

Introduction

The CMS experiment at the LHC has used a multi-thread enabled data processing framework, CMSSW [1], for large-scale data processing since the start of LHC Run 2 in 2016. Using multiple threads allows the framework to use substantially less memory per CPU than running many single-threaded jobs, allowing jobs to fit within CMS’s memory constraints. This framework makes use of Intel’s Threading Building Blocks (TBB) library [2] to handle scheduling of processing tasks across the limited number of threads available to the process. The motivation for comparing TBB with OpenMP is the growing need for CMS to exploit resources from High Performance Computing (HPC) facilities in the coming years; these facilities typically support only OpenMP as the intra-process concurrency mechanism. These proceedings first review the relevant OpenMP constructs and describe the demonstrator frameworks used for the comparison. This is followed by the experimental setup used to do the measurements as well as the results of those measurements
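
As a hedged illustration of the TBB task scheduling mentioned above, the sketch below uses the common tbb::task_group pattern. It is not the framework's actual scheduling code, and the algorithm functions are hypothetical placeholders.

    // Minimal sketch of scheduling two independent "algorithms" as TBB tasks.
    #include <tbb/task_group.h>
    #include <cstdio>

    // Hypothetical placeholders for the work done by framework algorithms.
    void runAlgorithmA() { std::printf("running algorithm A\n"); }
    void runAlgorithmB() { std::printf("running algorithm B\n"); }

    int main() {
      tbb::task_group group;
      // Each run() call hands a task to TBB's scheduler, which executes the
      // tasks on its worker threads within the process's limited thread pool.
      group.run([] { runAlgorithmA(); });
      group.run([] { runAlgorithmB(); });
      group.wait();  // block until both tasks have completed
      return 0;
    }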

Review of OpenMP Commands
Construct: omp parallel
Construct: omp for
Nested parallel blocks
Construct: omp task
Construct: omp taskloop
Demonstrator Frameworks
Experimental Setup and Results
Conclusion
