Abstract
The CERN Tape Archive (CTA) provides a tape backend to disk systems and, in conjunction with EOS, manages the data of the LHC experiments at CERN. Magnetic tape storage offers the lowest cost per unit volume today, followed by hard disks and flash. In addition, current tape drives deliver solid bandwidth (typically 360 MB/s per device), but at the cost of high latencies, both for mounting a tape in the drive and for positioning when accessing non-adjacent files. As a consequence, the transfer scheduler should queue transfer requests until the volume warranting a tape mount is reached. In spite of these transfer latencies, user-interactive operations should have a low latency. The scheduling system for CTA was built from the experience gained with CASTOR. Its implementation ensures reliability and predictable performance, while simplifying development and deployment. As CTA is expected to be used for a long time, lock-in to vendors or technologies was minimized. Finally, quality assurance systems were put in place to validate reliability and performance while allowing fast and safe development turnaround.
Highlights
The CERN Tape Archive (CTA), in conjunction with the EOS disk system, stores the Physics data of experiments at CERN
Unlike CERN Advanced STORage (CASTOR), CTA contains no file directory structure or disk storage system: this is the responsibility of the client disk system
Writes define the destination as a tape pool, while reads are limited to the tape where the file is located. Both CASTOR and CTA logically queue reads to individual tapes, and writes to tape pools, creating a natural grouping of requests
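The natural grouping described in the last highlight can be illustrated with a minimal Python sketch. It is not CTA code: the Request class, its field names, and the pool and tape identifiers are hypothetical, chosen only to show writes keyed by tape pool and reads keyed by tape.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical transfer request, for illustration only."""
    kind: str        # "archive" (write to tape) or "retrieve" (read from tape)
    target: str      # tape pool name for writes, tape identifier for reads
    size_bytes: int


def group_requests(requests):
    """Group requests into logical queues keyed by (kind, target)."""
    queues = defaultdict(list)
    for request in requests:
        queues[(request.kind, request.target)].append(request)
    return dict(queues)


requests = [
    Request("archive", "pool_a", 10),
    Request("retrieve", "V00001", 5),
    Request("archive", "pool_a", 20),
    Request("retrieve", "V00002", 7),
    Request("retrieve", "V00001", 3),
]
queues = group_requests(requests)
```

Here the two writes to pool_a share one queue, and the reads split into one queue per tape, so a single mount can serve every request in a queue.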
Summary
The CERN Tape Archive (CTA), in conjunction with the EOS disk system, stores the Physics data of experiments at CERN. Writes define the destination as a tape pool, while reads are limited to the tape where the file is located. Both CASTOR and CTA logically queue reads to individual tapes, and writes to tape pools, creating a natural grouping of requests. Multiple properties of the queue contents and drive statuses are polled to decide when a new tape mount should be started, based on data volume, request age, priority and user drive allowance. These queue introspection requirements make standard message-passing packages inadequate. This article describes the new queueing system introduced with CTA, which builds on the experience gathered in CASTOR as well as modern possibilities in databases and object stores to meet those challenges.
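To make the mount decision more concrete, the following Python sketch shows one way the four criteria named above (data volume, request age, priority and user drive allowance) could be combined. It is an illustration under stated assumptions, not CTA's actual algorithm: the QueueSummary and MountPolicy classes, the thresholds, and the rule that either enough volume or an old enough request triggers a mount are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueueSummary:
    """Hypothetical summary obtained by inspecting one queue."""
    owner: str                   # user or experiment the queue belongs to
    queued_bytes: int            # total data volume waiting in the queue
    oldest_request_age_s: float  # age of the oldest queued request, in seconds
    priority: int                # higher value means more urgent


@dataclass
class MountPolicy:
    """Hypothetical per-owner thresholds (assumed, not taken from CTA)."""
    min_bytes: int    # volume that warrants a mount
    max_age_s: float  # request age that warrants a mount regardless of volume
    max_drives: int   # drive allowance for this owner


def is_eligible(queue, policy, drives_in_use):
    """A queue may trigger a mount if its owner has a free drive in its
    allowance and either enough data or an old enough request is queued."""
    if drives_in_use >= policy.max_drives:
        return False
    return (queue.queued_bytes >= policy.min_bytes
            or queue.oldest_request_age_s >= policy.max_age_s)


def pick_next_mount(queues, policies, drives_in_use):
    """Return the eligible queue with the highest priority, using the age of
    the oldest request as a tie-breaker, or None if no queue is eligible."""
    candidates = [
        q for q in queues
        if is_eligible(q, policies[q.owner], drives_in_use.get(q.owner, 0))
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda q: (q.priority, q.oldest_request_age_s))


GB = 10**9
policy = MountPolicy(min_bytes=100 * GB, max_age_s=3600, max_drives=2)
policies = {"atlas": policy, "cms": policy, "alice": policy}
queues = [
    QueueSummary("atlas", 500 * GB, 60, priority=1),   # eligible by volume
    QueueSummary("cms", 1 * GB, 7200, priority=2),     # eligible by age
    QueueSummary("alice", 900 * GB, 30, priority=3),   # allowance exhausted
]
chosen = pick_next_mount(queues, policies, drives_in_use={"alice": 2})
```

In this example the cms queue is chosen: alice has the highest priority but no drive left in its allowance, and cms outranks atlas on priority. The decision needs aggregate properties of each queue, which is the kind of introspection a plain message queue does not expose.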