Abstract

Public Infrastructure-as-a-Service (IaaS) clouds abstract the physical hardware implementation of the resources provided to users. Users are not informed about the exact physical location of their virtual machines (VMs), the specific hardware used, the number of co-resident VMs, or the workloads those co-resident VMs are running. Detecting when VMs underperform can help identify resource contention from co-resident VMs and prompt replacement of the affected VMs. In addition, resource utilization metrics may help classify the performance of runs for use in VM performance model datasets that sample the distribution of performance outcomes. VM performance models are key to optimizing the cost of bioinformatics analyses in the public cloud. In this paper, we investigate the performance variation of big data genomics workflows running in the public cloud. We examine causes of performance variation including VM provisioning, CPU heterogeneity, and resource contention. We leverage Amazon Elastic Compute Cloud (EC2) placement groups, a feature designed to influence VM placement on Amazon EC2, to examine how VM placement impacts performance variation. As a use case, we investigate the performance of a multi-stage bioinformatics RNA sequencing (RNA-seq) analytical workflow consisting of four distinct phases, which executes in ~90 minutes on average on 8-core public cloud VMs. In addition, we investigate whether Linux resource utilization metrics collected by profiling workflow runs can help identify performance variations.
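
As a rough illustration of the idea of flagging underperforming runs from profiling data, the sketch below shows one possible approach. It is not the method used in the paper. The function names, the 10% tolerance, and the choice of CPU steal time as a contention signal are all assumptions made for this example. The only external fact relied on is the field order of the aggregate `cpu` line in Linux `/proc/stat` (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice).

```python
from statistics import median


def flag_slow_runs(runtimes_min, tolerance=0.10):
    """Return indices of runs slower than the median by more than `tolerance`.

    `tolerance` is a hypothetical threshold chosen for illustration only.
    """
    baseline = median(runtimes_min)
    return [i for i, t in enumerate(runtimes_min) if t > baseline * (1 + tolerance)]


def cpu_steal_fraction(before, after):
    """Fraction of CPU time reported as 'steal' between two /proc/stat 'cpu' lines.

    Steal time is used here as an assumed proxy for contention from
    co-resident VMs; the source does not name a specific metric.
    """
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    deltas = [y - x for x, y in zip(b, a)]
    # Sum user..steal only: guest time is already counted inside user/nice.
    total = sum(deltas[:8])
    return deltas[7] / total if total else 0.0


# Hypothetical runtimes (minutes) for five runs of the same workflow.
slow = flag_slow_runs([88, 90, 91, 89, 104])
```

With the made-up runtimes above, only the last run exceeds the median by more than 10%, so it would be the one singled out for closer inspection of its utilization metrics.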
