Abstract

MapReduce is a widely preferred cloud computing framework for large-scale data analysis and application processing. Existing MapReduce frameworks suffer performance degradation because they adopt sequential processing approaches with little modification, and they therefore underutilize cloud resources. To overcome this drawback and reduce costs, we introduce a Parallel MapReduce (PMR) framework in this paper. We design a novel parallel execution strategy for Map and Reduce worker nodes. The strategy improves performance and cloud resource utilization by executing Map and Reduce functions in parallel on the multicore environments available on computing nodes. We explain the makespan modeling and working principle of the PMR framework in detail. The performance of PMR is compared with Hadoop through experiments on three biomedical applications. Experiments conducted for the BLAST, CAP3, and DeepBind applications report makespan reductions of 38.92%, 18.00%, and 34.62%, respectively, for the PMR framework against the Hadoop framework. The experimental results indicate that the proposed PMR cloud computing platform is robust, cost-effective, and scalable, and that it supports diverse applications on public and private cloud platforms. The results also show a good match between the theoretical makespan model presented and the experimental values.

Highlights

  • Delivering data-intensive applications and services on cloud platforms is the new paradigm

  • Based on the results presented, executing the BLAST sequence alignment algorithm on the proposed Parallel MapReduce (PMR) framework yields superior results compared with similar experiments conducted on the existing Hadoop framework

  • To lower execution times and enable effective utilization of cloud resources, this paper proposes a PMR cloud computing platform


Summary

Introduction

Delivering data-intensive applications and services on cloud platforms is the new paradigm. To address issues related to sequential execution, a Cloud MapReduce (CMR) framework is discussed in [13]. Its authors developed a parallelized model that adopts a pipelining execution approach to process streaming and batch data. Their cloud-based MapReduce model supports parallelism between the Map and Reduce phases and among individual jobs.

This paper presents: (i) makespan modeling and design of the PMR cloud framework; (ii) a parallel execution strategy for the Map and Reduce phases; (iii) maximization of cloud resource utilization by computing on multicore environments in Map and Reduce; (iv) performance evaluation on state-of-the-art biomedical applications such as BLAST, CAP3, and DeepBind; (v) experiments considering diverse cloud configurations and varied application configurations; (vi) the correlation between the theoretical makespan model and experimental values.
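The idea of parallelism between the Map and Reduce phases can be illustrated with a minimal Python sketch. It is not the PMR or CMR implementation: the k-mer counting workload and the names `map_chunk`, `reduce_partials`, and `pipelined_mapreduce` are hypothetical, chosen only to show Map tasks running concurrently while Reduce consumes each result as soon as it is ready.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed


def map_chunk(sequence, k=3):
    """Map (hypothetical workload): count every k-mer in one sequence chunk."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def reduce_partials(left, right):
    """Reduce: merge two partial count tables."""
    return left + right


def pipelined_mapreduce(chunks, workers=4, executor_cls=ThreadPoolExecutor):
    """Run Map tasks concurrently and reduce each result as it arrives.

    Assumption: a thread pool is used so the sketch runs anywhere. For
    CPU-bound Map functions on a multicore node, passing
    concurrent.futures.ProcessPoolExecutor as executor_cls is the usual choice.
    """
    total = Counter()
    with executor_cls(max_workers=workers) as pool:
        map_futures = [pool.submit(map_chunk, chunk) for chunk in chunks]
        # Reduce starts as soon as any Map task finishes, rather than
        # waiting for the whole Map phase to complete.
        for future in as_completed(map_futures):
            total = reduce_partials(total, future.result())
    return total


counts = pipelined_mapreduce(["ACGTAC", "GTACGT"])
print(counts["GTA"])  # 2
```

Because the merge step is associative and commutative, the order in which Map results arrive does not change the final counts, which is what makes overlapping the two phases safe in this sketch.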

Literature Review
The Proposed PMR Framework
PMR Makespan Model
Performance Evaluation
Findings
Conclusion and Future Work