Abstract
In the ALICE experiment, hundreds of users analyze large datasets on a Grid system. High throughput and short turn-around times are achieved by a centralized system called the LEGO trains. This system combines the analyses of different users in so-called analysis trains, which are then executed within the same Grid jobs, thereby reducing the number of times the data needs to be read from the storage systems. Compared with single-user analysis, the centralized trains improve performance, usability, and bookkeeping. The train system builds upon the existing ALICE tools, i.e., the analysis framework as well as the Grid submission and monitoring infrastructure. The entry point to the train system is a web interface, which is used to configure the analysis and the desired datasets as well as to test and submit the train. Several measures have been implemented to reduce the time a train needs to finish and to increase the CPU efficiency.
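The benefit of running several analyses within the same job can be illustrated with a minimal Python sketch. It is not the ALICE analysis framework: `Event`, `Task`, `run_individually`, and `run_as_train` are hypothetical names, and a dataset is modeled as a plain list of dictionaries. The sketch only counts how often each event is read when every user runs separately and when all tasks share one pass over the data.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: one recorded event and one user's analysis task.
Event = Dict[str, float]
Task = Callable[[Event], float]


def run_individually(events: List[Event], tasks: Dict[str, Task]) -> Tuple[Dict[str, float], int]:
    """Single-user style: every task triggers its own pass over the data."""
    reads = 0
    results = {name: 0.0 for name in tasks}
    for name, task in tasks.items():
        for event in events:
            reads += 1  # the event is read again for each task
            results[name] += task(event)
    return results, reads


def run_as_train(events: List[Event], tasks: Dict[str, Task]) -> Tuple[Dict[str, float], int]:
    """Train style: each event is read once and handed to every task."""
    reads = 0
    results = {name: 0.0 for name in tasks}
    for event in events:
        reads += 1  # one read shared by all tasks on the train
        for name, task in tasks.items():
            results[name] += task(event)
    return results, reads
```

With N tasks on a train, the sketch reads each event once instead of N times while producing the same per-task results. Real trains involve further considerations, such as job splitting and output merging, that this toy model ignores.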
Highlights
The ALICE collaboration records around 10 PB of data every year.
The LEGO train system is a workflow for organized analysis in ALICE.
It was developed to increase the CPU efficiency of the analysis jobs and to get more user analysis done with the same amount of computing resources.
Summary
The ALICE collaboration records around 10 PB of data every year. This data is stored on different storage elements all over the world. Another advantage of the LEGO trains is that they hide the Grid complexity from the users. Users just have to define their code on a web page, which provides the analysis results as soon as they are available. The time from the start of the first analysis jobs until the merging is finished is called the turn-around time. This is the time the user has to wait for the output, and it should be as short as possible. Over the same period, the number of users increased from 60 to 188. This shows that the LEGO trains are well accepted among the users and that they are used for most of the analysis in ALICE. The reduction in July 2013 was achieved by putting the improvements discussed in section 4 into production.
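The definition of the turn-around time given above can be written as a short Python sketch. The function name `turn_around_time` and its arguments are hypothetical, and the timestamps in the usage note are invented for illustration only. They are not taken from the train system or its monitoring.

```python
from datetime import datetime, timedelta
from typing import Iterable


def turn_around_time(job_start_times: Iterable[datetime],
                     merge_finished: datetime) -> timedelta:
    """Time from the start of the first analysis job until merging is finished."""
    starts = list(job_start_times)
    if not starts:
        raise ValueError("at least one job start time is required")
    first_start = min(starts)  # the earliest job defines the starting point
    if merge_finished < first_start:
        raise ValueError("merging cannot finish before the first job starts")
    return merge_finished - first_start
```

For example, with invented job start times of 09:30 and 10:00 and merging finished at 15:00 on the same day, the function returns a `timedelta` of 5 hours and 30 minutes.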