Abstract

With advances in cloud computing and virtualization technologies, running MapReduce applications in the cloud has attracted growing attention in recent years. A fundamental problem, however, is that the performance of MapReduce applications can be severely degraded by the overhead of I/O virtualization and by resource competition among virtual machines. In this paper, we propose a dynamic block device reconfiguration algorithm for virtual MapReduce clusters that reduces the data transfer time between virtual machines, thereby improving the performance of MapReduce applications running in the cloud. The proposed algorithm uses a block device reconfiguration scheme in which a block device attached to one virtual machine can be detached and reattached to another virtual machine at runtime. This scheme lets us move files between virtual machines without any network transfer. The algorithm is dynamic in the sense that it estimates the total data transfer time between virtual machines using multiple regression analysis based on CPU utilization and data size, and adaptively selects the least-cost data transfer path between a mapper virtual machine and a reducer virtual machine. We have implemented our algorithm in Hadoop MapReduce. Benchmark results show that the overhead of transferring data from mapper virtual machines to reducer virtual machines is reduced and that the execution times of MapReduce applications are shortened by up to 14%.
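The path-selection idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the form of the regression model, the set of candidate paths, or any coefficients. The sketch assumes a linear model t = b0 + b1*cpu + b2*size fitted separately for each path, two hypothetical paths named "network" and "reattach", and made-up training samples. All function names are illustrative.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float, float]   # (cpu_utilization_percent, data_size_mb, seconds)
Coef = Tuple[float, float, float]     # (b0, b1, b2)


def fit_linear(samples: List[Sample]) -> Coef:
    """Ordinary least squares for t = b0 + b1*cpu + b2*size (assumed model form)."""
    rows = [(1.0, cpu, size) for cpu, size, _ in samples]
    ys = [t for _, _, t in samples]
    n = 3
    # Augmented normal equations: [X^T X | X^T y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("samples do not determine the model")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return (a[0][n] / a[0][0], a[1][n] / a[1][1], a[2][n] / a[2][2])


def predict(coef: Coef, cpu: float, size: float) -> float:
    """Estimated transfer time for one path."""
    b0, b1, b2 = coef
    return b0 + b1 * cpu + b2 * size


def choose_path(models: Dict[str, Coef], cpu: float, size: float) -> Tuple[str, Dict[str, float]]:
    """Return the path with the smallest estimated transfer time, plus all estimates."""
    estimates = {name: predict(coef, cpu, size) for name, coef in models.items()}
    return min(estimates, key=estimates.get), estimates


def synthetic(b0: float, b1: float, b2: float) -> List[Sample]:
    """Made-up measurements generated from known coefficients (illustration only)."""
    return [
        (cpu, size, b0 + b1 * cpu + b2 * size)
        for cpu in (10.0, 50.0, 90.0)
        for size in (100.0, 500.0, 1000.0)
    ]


# Hypothetical cost profiles: the network path grows quickly with data size,
# while reattaching a block device has a higher fixed cost but scales gently.
models = {
    "network": fit_linear(synthetic(1.0, 0.05, 0.02)),
    "reattach": fit_linear(synthetic(4.0, 0.01, 0.001)),
}

if __name__ == "__main__":
    for size in (50.0, 1000.0):
        best, est = choose_path(models, cpu=50.0, size=size)
        print(size, best, {k: round(v, 2) for k, v in est.items()})
```

With these invented profiles, the sketch prefers the network path for small transfers and the reattach path for large ones. In a real system the crossover point would depend on measured data.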
