Abstract

Public Infrastructure-as-a-Service (IaaS) clouds abstract away many details of how the resources provided to users are implemented. For example, users are not informed about the exact physical location of their virtual machines (VMs), the specific hardware used, the number of co-resident VMs, or the workloads that co-resident VMs are running. Detecting when VMs underperform can help identify resource contention from co-resident VMs and prompt replacement of the affected VMs. Resource utilization metrics can help classify the performance of runs included in VM performance model datasets, which sample the distribution of performance outcomes in the cloud. VM performance models are key to predicting the cost of bioinformatics analyses in the public cloud. This paper investigates the performance variations of running an RNA sequencing (RNA-seq) workflow in the public cloud. We examine causes of performance variation including VM provisioning, CPU heterogeneity, and resource contention. We leverage Amazon Elastic Compute Cloud (EC2) placement groups, a feature designed to influence VM placement, to examine how VM placement affects performance variation. As a use case, we investigate the performance of a multi-stage bioinformatics RNA-seq analytical workflow consisting of four distinct phases, which executes in 90 minutes on average using 8-core public cloud VMs. In addition, we investigate whether Linux resource utilization metrics collected by profiling workflow runs can help identify performance implications.
