Abstract
The RIKEN Computing Center in Japan (CCJ) has been developed to enable the analysis of the huge volumes of data collected by the PHENIX experiment at RHIC. The collected raw data, or the reconstructed data, are transferred from Brookhaven National Laboratory (BNL) via SINET3 with 10 Gbps bandwidth, using GridFTP. The transferred data are first stored in a hierarchical storage management system (HPSS) prior to user analysis. Since the data volume grows steadily year by year, concentrated access requests to the data servers have become a serious bottleneck. To eliminate this I/O-bound problem, 18 computing nodes with a total of 180 TB of local disk were introduced, so that the data can be staged in advance. We extended the configuration of the batch job scheduler (LSF) so that users can specify the required data already distributed to the local disks. The locations of the data are obtained automatically from a database, and each job is dispatched to an appropriate node holding the required data. To prevent several jobs on one node from accessing a local disk simultaneously, lock files and access control lists are employed; as a result, each job handles a local disk exclusively. The total throughput improved drastically compared to the preexisting CCJ nodes, and users can analyze about 150 TB of data within 9 hours. We report this successful job submission scheme and the features of this PC cluster.
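The lock-file technique mentioned above can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's actual implementation: the lock-file path and function name are hypothetical, and the access-control-list part of the scheme is not shown. The idea is simply that a job takes a non-blocking exclusive lock on a per-disk lock file, and keeps it for the job's lifetime, so concurrent jobs on the same node fall back rather than contend for the same local disk.

```python
import fcntl

def acquire_disk_lock(lock_path):
    """Try to take an exclusive, non-blocking lock on a per-disk lock file.

    Returns the open file object on success (the caller must keep it open
    for the lifetime of the job, since closing it releases the lock),
    or None if another job already holds the lock on this disk.

    Note: the lock path convention is hypothetical, e.g. one lock file
    per local disk such as "/data0/.lock".
    """
    fd = open(lock_path, "w")
    try:
        # LOCK_EX = exclusive lock, LOCK_NB = fail instead of blocking
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd
    except OSError:
        fd.close()
        return None
```

With this scheme, the first job on a node acquires the lock and reads its staged data from the local disk; a second job attempting the same disk gets `None` and can wait or be rescheduled, so each local disk is effectively handled by one job at a time.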