SDN-based and multitenant-aware resource provisioning mechanism for cloud-based big data streaming

Cleverton Vicentini, Altair Santin, Eduardo Viegas, Vilmar Abreu

doi:10.1016/j.jnca.2018.11.005

Abstract

Cloud computing provides elastic on-demand resource allocation, enabling big data systems to process large amounts of streaming data in real time. However, a shared cloud infrastructure (multitenant at the hypervisor level) may reduce system performance or even resource availability, particularly when big data processing demands significantly increase through concurrent task allocations on the same physical hardware. Such situations are hard to detect from the tenant's perspective: because the infrastructure is not under the tenant's control, the tenant may experience poor performance without knowing its cause. Moreover, as task processing demand changes over time, the available infrastructure may become insufficient owing to increased processing load or multitenant interference. This paper presents a multitenant-aware resource provisioning mechanism that is independent of any hypervisor and can perform task scheduling and dynamic rescheduling of ongoing tasks for big data streaming while considering the state of each virtual machine (VM). Moreover, the proposed mechanism ensures load balancing across several cloud-based clusters of VMs using a software-defined network (SDN). The prototype was implemented using Apache Storm (big data), Helion Eucalyptus (cloud computing), and Floodlight (SDN). The evaluation shows that when the resources are under multitenant interference, our proposal results in an improvement of 50.1% for CPU-bound tasks, 62.3% for disk-bound tasks, and 43.8% for network-bound tasks. In addition, the load balancer forwarded 72.04% of the load to a fully available cluster, meaning that our mechanism can realize a 22.04% improvement in effectiveness over traditional approaches.
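
As a rough illustration of the two ideas named in the abstract (scheduling that considers the state of each VM, and load balancing across clusters), the following minimal Python sketch may help. It is not the paper's implementation: the `VMState` fields, the `pick_vm` and `split_load` functions, and the proportional-split rule are all hypothetical names and simplifications introduced here, and the sketch does not use Apache Storm, Eucalyptus, or Floodlight.

```python
from dataclasses import dataclass


@dataclass
class VMState:
    """Hypothetical snapshot of one VM's free capacity (each value in 0.0..1.0)."""
    name: str
    cpu_free: float
    disk_free: float
    net_free: float


def pick_vm(vms, bound):
    """Return the VM with the most free capacity for the task's bottleneck.

    `bound` is one of "cpu", "disk", or "net" (an assumed task classification).
    """
    attr = {"cpu": "cpu_free", "disk": "disk_free", "net": "net_free"}[bound]
    return max(vms, key=lambda vm: getattr(vm, attr))


def split_load(availability):
    """Share load across clusters in proportion to their availability scores.

    `availability` maps a cluster name to a non-negative score. If every
    score is zero, fall back to an even split.
    """
    total = sum(availability.values())
    if total == 0:
        return {name: 1 / len(availability) for name in availability}
    return {name: score / total for name, score in availability.items()}


if __name__ == "__main__":
    vms = [
        VMState("vm-a", cpu_free=0.2, disk_free=0.9, net_free=0.5),
        VMState("vm-b", cpu_free=0.8, disk_free=0.3, net_free=0.6),
    ]
    print(pick_vm(vms, "cpu").name)   # vm-b has more free CPU
    print(pick_vm(vms, "disk").name)  # vm-a has more free disk
    print(split_load({"cluster-1": 3, "cluster-2": 1}))
```

In this sketch a cluster suffering interference would report a lower availability score and so receive a smaller share of the load; how the actual mechanism measures VM state and steers traffic through the SDN controller is described in the full text, not here.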

Full Text