Abstract

The CMS collaboration at the CERN LHC has made more than one petabyte of open data available to the public, including large parts of the data that formed the basis for the discovery of the Higgs boson in 2012. Apart from their scientific value, these data can be used for education and outreach as well as for software development. However, in their original format, the data cannot be accessed easily without experiment-specific knowledge and skills. This work presents an approach for setting up open analyses that stay close to the published ones while requiring only minimal experiment-specific knowledge and software. The suitability of this approach for education and outreach is demonstrated with analyses that have been made fully accessible to the public via the CERN Open Data portal. Further, the value of these data for software development, and as a basis for benchmarking analysis software under the realistic conditions of a high-energy physics experiment, is discussed.

Highlights

  • The CMS collaboration has recently released a new batch of open data to the public [2] on the CERN Open Data portal [1]

  • The release increases the volume of the open data to more than two petabytes, including large parts of the data used for the discovery of the Higgs boson in 2012

  • These resources can be expected to grow continuously under the CMS Open Data policy [3], which commits the collaboration to releasing 100 % of its analysable data within ten years of collecting them, making CMS Open Data an invaluable resource for open science

Summary

Introduction

The CMS collaboration has recently released a new batch of open data to the public [2] on the CERN Open Data portal [1]. CMS Open Data releases already form the basis of scientific publications, naturally in the field of particle physics [7, 8] but also in studies related to data science and machine learning [9, 10]. Beyond these purely scientific use cases, open data are valuable for education and outreach and are already actively used around the world [11, 12], especially to attract young people to CMS and high-energy physics. This paper presents an approach that eases the use of open data for education and outreach and points out the additional value of such resources for software development in high-energy physics.

Data-format of CMS Open Data
Analysis of the di-muon spectrum
Analysis of Higgs boson decays to two tau leptons
Usage for software development
Findings
Summary