Abstract

To compare the computing capabilities of GPUs with those of traditional CPU cores, a high-statistics toy Monte Carlo technique has been implemented in both the ROOT/RooFit and GooFit frameworks, with the aim of estimating the statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψϕ invariant mass in the three-body decay B+ → J/ψϕK+. GooFit is an open-source data analysis tool, still under development, that interfaces ROOT/RooFit to the CUDA platform on nVidia GPUs. The optimized GooFit application, running on GPUs hosted by servers in the Bari Tier2, achieves a striking speed-up over the RooFit application parallelised on multiple CPUs by means of the PROOF-Lite tool. This speed-up, evident when comparing concurrent GooFit processes allowed by the CUDA Multi Process Service with a RooFit/PROOF-Lite process running multiple CPU workers, is presented and discussed in detail. GooFit has also made it possible to explore the behaviour of a likelihood ratio test statistic in different situations in which the Wilks theorem may or may not apply, depending on whether its regularity conditions are satisfied.

Highlights

  • GooFit is run on two servers, one equipped with two nVidia Tesla K20 GPUs and 32 cores (16 + 16 by Hyper-Threading) and the other with one nVidia Tesla K40 GPU and 40 (20 + 20) cores [7]. To compare the computing capabilities of GPUs and CPU cores, a high-statistics toy MC technique has been implemented in both the GooFit and RooFit frameworks to estimate the local statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψφ invariant mass in the three-body decay B+ → J/ψφK+ [8]

  • A first performance comparison can be obtained on the server hosting the two Tesla K20 GPUs: using 2 GooFit/Multi Process Service (MPS) jobs on the 2 GPUs and 30 CPUs against one PROOF-Lite job using 30 CPUs, the speed-up is a factor of 45 and, as expected, fairly stable over a strongly varying number of MC toys

  • $P = \int f(\Delta\chi^2)\, d(\Delta\chi^2) < (57.7 \cdot 10^{6})^{-1} \simeq 1.73 \cdot 10^{-8}$ (1). This corresponds to the statistical significance $Z_\sigma = \Phi^{-1}(1 - P)\,\sigma > 5.52\sigma$, obtained through the inverse of the cumulative distribution function of the standard Gaussian, which is compatible with the lower limit of 5σ quoted in [8] on the basis of 50.5M MC toys obtained by means of RooFit
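
The conversion from the p-value bound to a Gaussian significance quoted in the last highlight can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `significance_from_p` is hypothetical, and treating the p-value as the bound 1/(57.7 · 10^6) is an assumption read off from the numbers in the text, not a statement of how the original analysis code is written.

```python
from statistics import NormalDist


def significance_from_p(p_value):
    """One-sided Gaussian significance Z = Phi^-1(1 - p), in units of sigma."""
    # By symmetry of the standard Gaussian, Phi^-1(1 - p) = -Phi^-1(p).
    # The second form avoids the rounding loss of forming 1 - p when p is tiny.
    return -NormalDist().inv_cdf(p_value)


n_toys = 57.7e6          # number of MC toys quoted in the text
p_bound = 1.0 / n_toys   # assumed p-value bound, about 1.73e-8
z = significance_from_p(p_bound)

print(f"p-value bound = {p_bound:.3g}, significance = {z:.2f} sigma")
```

Evaluating the inverse CDF at P rather than at 1 − P is a numerical convenience only; both forms define the same Z.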


Summary

Pseudo-experiments for p-value estimation and GooFit performances

To compare the computing capabilities of GPUs and CPU cores, a high-statistics toy MC technique has been implemented in both the GooFit and RooFit frameworks to estimate the local statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψφ invariant mass in the three-body decay B+ → J/ψφK+ [8]. A first performance comparison can be obtained on the server hosting the two Tesla K20 GPUs: using 2 GooFit/MPS jobs on the 2 GPUs and 30 CPUs (each job running 15 processes) against one PROOF-Lite job using 30 CPUs, the speed-up is a factor of 45 and, as expected, fairly stable over a strongly varying number of MC toys. A further comparison is between one RooFit/PROOF-Lite job using 16 workers (on 16 CPU cores) and one GooFit/MPS job running 16 simultaneous processes on a single Tesla K40 or Tesla K20. The p-value obtained from the toys corresponds to the statistical significance $Z_\sigma = \Phi^{-1}(1 - P)\,\sigma > 5.52\sigma$, obtained through the inverse of the cumulative distribution function of the standard Gaussian, which is compatible with the lower limit of 5σ quoted in [8] on the basis of 50.5M MC toys obtained by means of RooFit.
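
The pseudo-experiment procedure summarised above can be illustrated with a deliberately simplified sketch. It is not the GooFit or RooFit implementation: the model (a Gaussian sample with known width, testing a zero mean against a free mean), the function names `delta_chi2` and `toy_p_value`, and all numerical settings are assumptions chosen so that the example runs with the Python standard library alone. For this model the likelihood ratio statistic reduces to n·x̄²/σ² and follows a χ² distribution with one degree of freedom, so the toy estimate can be compared with the known tail probability (about 0.05 above 3.84). The rule of quoting 1/N as an upper limit when no toy exceeds the observed value is likewise an assumption about how a bound such as (57.7 · 10^6)^-1 would arise.

```python
import random


def delta_chi2(sample, sigma=1.0):
    """Likelihood-ratio statistic -2 ln(L0 / L1) for H0: mu = 0 vs H1: mu free.

    For a Gaussian with known sigma this equals n * mean^2 / sigma^2.
    """
    n = len(sample)
    mean = sum(sample) / n
    return n * mean * mean / (sigma * sigma)


def toy_p_value(q_obs, n_toys, n_events, seed=12345):
    """Estimate P(q >= q_obs) under H0 from n_toys pseudo-experiments.

    Returns (p, is_upper_limit). If no toy reaches q_obs, the estimate
    cannot resolve the p-value and 1 / n_toys is returned as an upper limit.
    """
    rng = random.Random(seed)
    n_exceed = 0
    for _ in range(n_toys):
        # Generate one pseudo-experiment under the null hypothesis (mu = 0).
        toy = [rng.gauss(0.0, 1.0) for _ in range(n_events)]
        if delta_chi2(toy) >= q_obs:
            n_exceed += 1
    if n_exceed == 0:
        return 1.0 / n_toys, True
    return n_exceed / n_toys, False


p, is_limit = toy_p_value(q_obs=3.84, n_toys=20000, n_events=20)
print(f"toy p-value = {p:.4f} (upper limit: {is_limit})")
```

Each toy is independent of the others, which is what makes this kind of study easy to distribute over many CPU workers or concurrent GPU processes.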

Exploring the applicability limits of Wilks theorem
Future developments