Abstract
New technological advancements in synchrotron light sources enable data acquisitions at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing as in tomographic reconstruction methods require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modeling of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (i) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on experimented resources). Moreover, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks.
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.