Abstract

Cloud-based scientific data management - storage, transfer, analysis, and inference extraction - is attracting interest. In this paper, we propose a next-generation cloud deployment model suitable for data-intensive applications. Our model is a flexible, self-service, container-based infrastructure that delivers network, computing, and storage resources together with the logic to dynamically manage these components in a holistic manner. We demonstrate the strength of our model with a bioinformatics application. Dynamic algorithms for resource provisioning and job allocation suited to the chosen dataset are packaged and delivered in a privileged virtual machine as part of the container. We tested the model on our private experimental cloud, built on low-cost commodity hardware, and demonstrate its capability to create the required network and computing resources and to allocate submitted jobs. The results obtained show the benefits of increased automation, both as a significant improvement in the time to complete a data analysis and as a reduction in the cost of analysis. The proposed algorithms reduced the cost of performing analysis by 50% for a 15 GB data analysis, and the total time between submitting a job and writing the results after analysis was also reduced by more than 1 hour for a 15 GB data analysis.

Highlights

  • Large-scale data are increasingly generated from a wide variety of sources such as scientific experiments and monitoring devices

  • LJF-KQ algorithm: To provision the required VMs, we propose a variation of [30], the Largest Job First on K Queues (LJF-KQ) strategy

  • LJF-KQ-L algorithm: The Largest Job First on K Queues with Lookup (LJF-KQ-L) algorithm is a variation of LJF-KQ that adds a lookup of queue finish times (a sketch of both strategies follows this list)
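
The paper's own pseudocode for these strategies is not reproduced in this excerpt. As a minimal sketch, assuming jobs are characterised only by their input size and queues by a single processing rate, LJF-KQ can be read as dealing jobs out largest-first across K VM queues, while LJF-KQ-L additionally looks up each queue's estimated finish time and places the next job on the queue that frees up first. The round-robin placement, the rate parameter, and the job records below are illustrative assumptions, not the paper's definitions.

import heapq

def ljf_kq(jobs, k):
    # LJF-KQ reading: sort jobs largest-first, then deal them out across
    # K VM queues. The round-robin placement is an assumption.
    queues = [[] for _ in range(k)]
    ordered = sorted(jobs, key=lambda j: j["size"], reverse=True)
    for i, job in enumerate(ordered):
        queues[i % k].append(job)
    return queues

def ljf_kq_l(jobs, k, rate=1.0):
    # LJF-KQ-L reading: same largest-first ordering, but each job goes to
    # the queue with the earliest estimated finish time. `rate` (data
    # processed per unit time) is a hypothetical parameter used only to
    # estimate finish times for this illustration.
    queues = [[] for _ in range(k)]
    finish = [(0.0, q) for q in range(k)]   # (estimated finish time, queue index)
    heapq.heapify(finish)
    for job in sorted(jobs, key=lambda j: j["size"], reverse=True):
        t, q = heapq.heappop(finish)        # queue that frees up first
        queues[q].append(job)
        heapq.heappush(finish, (t + job["size"] / rate, q))
    return queues

# Example: five jobs (sizes in GB) spread over K = 2 VM queues.
jobs = [{"id": i, "size": s} for i, s in enumerate([7, 3, 5, 1, 4])]
print(ljf_kq(jobs, 2))
print(ljf_kq_l(jobs, 2))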


Introduction

Large-scale data are increasingly generated from a wide variety of sources such as scientific experiments and monitoring devices.

Service layer: This layer enables the functionality for container-based cloud service creation. It provisions resources (e.g. CPU time, memory, storage, and network bandwidth) to a vCell, interacts with the underlying layer, and performs additional global scheduling.

Implementing the data analysis container: Our work considers self-service and dynamic algorithms for the initial VM size, VM provisioning, and job allocation. For application types with requirements other than memory, a different scheme is required. In both cases, knowledge of the domain and the historical output trace of previously executed related jobs are valuable inputs to the inference mechanism that determines the relationship between the task requirements and the capacity (e.g. bandwidth, memory, CPU) of the VMs; a sketch of such an estimate follows this excerpt.

Step 3: Creation of a virtual network interface (VIF) for the created VM.
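
As an illustration of the inference step described above, the sketch below estimates an initial VM memory size for a new job from the historical output traces of previously executed related jobs. The linear fit, the 20% headroom factor, the function name estimate_vm_memory, and the example trace values are assumptions made for this illustration only; they are not the paper's actual inference mechanism.

def estimate_vm_memory(input_size_gb, history, headroom=1.2):
    # `history` is a list of (input_size_gb, peak_memory_gb) records from
    # previously executed related jobs. A simple least-squares fit of peak
    # memory against input size is used here; the headroom factor adds a
    # safety margin to the prediction. Both choices are illustrative.
    n = len(history)
    sx = sum(x for x, _ in history)
    sy = sum(y for _, y in history)
    sxx = sum(x * x for x, _ in history)
    sxy = sum(x * y for x, y in history)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return headroom * (slope * input_size_gb + intercept)

# Example: traces from three earlier runs of the same analysis.
history = [(5, 2.1), (10, 4.0), (15, 6.2)]
print(round(estimate_vm_memory(12, history), 2), "GB")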
