Abstract
A common use pattern in the computing models of particle physics experiments is running many distributed applications that read from a shared set of data files. We refer to this data as auxiliary data, to distinguish it from (a) event data from the detector (which tends to be different for every job), and (b) conditions data about the detector (which tends to be the same for each job in a batch of jobs). Conditions data also tends to be relatively small per job, whereas both event data and auxiliary data are larger per job. Unlike event data, auxiliary data comes from a limited working set of shared files. Since there is spatial locality in the auxiliary data access, the use case appears to be identical to that of the CernVM-Filesystem (CVMFS). However, we show that distributing auxiliary data through CVMFS causes the existing CVMFS infrastructure to perform poorly. We utilize a CVMFS client feature called "alien cache" to cache data on existing local high-bandwidth data servers that were engineered for storing event data. This cache is shared between the worker nodes at a site and replaces caching CVMFS files on both the worker node local disks and the site's local squids. We have tested this alien cache with the dCache NFSv4.1 interface, Lustre, and the Hadoop Distributed File System (HDFS) FUSE interface, and measured performance. In addition, we use high-bandwidth data servers at central sites to perform the CVMFS Stratum 1 function instead of the low-bandwidth web servers deployed for the CVMFS software distribution function. We have tested this using the dCache HTTP interface. As a result, we have a design for an end-to-end high-bandwidth distributed caching read-only filesystem, using existing client software already widely deployed to grid worker nodes and existing file servers already widely installed at grid sites.
Files are published in a central place, soon become available on demand throughout the grid, and are cached locally at each site behind a convenient POSIX interface. This paper discusses the details of the architecture and reports performance measurements.
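The shared-cache idea described above can be illustrated with a small read-through cache on a shared directory. This is only a simplified sketch under stated assumptions, not the CVMFS client's actual implementation: the function names (`content_hash`, `cache_path`, `read_through`), the SHA-1 content addressing, and the two-character subdirectory layout are all chosen here for illustration, and `fetch` stands in for whatever upstream server (for example a Stratum 1) supplies the object on a miss.

```python
import contextlib
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Callable


def content_hash(data: bytes) -> str:
    """Hex digest used as the object's name (illustrative choice of SHA-1)."""
    return hashlib.sha1(data).hexdigest()


def cache_path(cache_dir: Path, digest: str) -> Path:
    """Hypothetical layout: fan objects out by the first two hex characters."""
    return cache_dir / digest[:2] / digest[2:]


def read_through(cache_dir: Path, digest: str,
                 fetch: Callable[[str], bytes]) -> bytes:
    """Return the object named by digest, filling the shared cache on a miss."""
    path = cache_path(cache_dir, digest)
    try:
        # Hit: another worker node (or an earlier job) already stored it.
        return path.read_bytes()
    except FileNotFoundError:
        pass

    # Miss: fetch from upstream and verify before publishing to the cache.
    data = fetch(digest)
    if content_hash(data) != digest:
        raise ValueError(f"upstream returned corrupt data for {digest}")

    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file in the same directory, then rename, so that
    # concurrent readers never observe a partially written object.
    fd, tmp = tempfile.mkstemp(dir=path.parent)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp, path)
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp)
        raise
    return data
```

Because objects are named by their content, two worker nodes that miss at the same time write identical bytes to the same final path, so whichever rename lands last leaves the cache in the same state. In this sketch the only cost of the race is a duplicate upstream fetch.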
Highlights
Fermilab has several physics experiments, including NOvA, MicroBooNE, and the Dark Energy Survey, whose distributed applications need to read from a shared set of data files.
This paper describes a new approach to the problem that still uses CVMFS software but takes advantage of a CVMFS client feature called "alien cache" to cache data on a site’s existing high-bandwidth data servers.
This design is intended only for relatively large data files, because storage elements are generally engineered for large files and every alien cache access incurs extra network traffic for clients.
Summary
D Dykstra, B Bockelman, J Blomer, K Herner, T Levshina and M Slyz
1 Scientific Computing Division, Fermilab, Batavia, IL 60510, USA
2 University of Nebraska-Lincoln, Lincoln, NE 68588, USA
3 PH-SFT Department, CERN, Geneva, Switzerland
21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), IOP Publishing. Journal of Physics: Conference Series 664 (2015) 042012, doi:10.1088/1742-6596/664/4/042012 (http://iopscience.iop.org/1742-6596/664/4/042012).
Operated by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the United States Department of Energy.