Abstract

Cloud computing is a relatively new form of computing that uses virtualized resources, is dynamically scalable, and is often provided as a pay-per-use service over the Internet, an intranet, or both. With increasing demand for data storage in the cloud, the study of data-intensive applications is becoming a primary focus. Data-intensive applications are those that involve high CPU usage and process large volumes of data, typically hundreds of gigabytes, terabytes, or petabytes in size. This study was conducted on Amazon Elastic Compute Cloud (EC2) and Amazon Elastic MapReduce (EMR) using the HiBench Hadoop benchmark suite, which is used for performing and evaluating Hadoop-based data-intensive computation on both of these cloud platforms. Both quantitative and qualitative comparisons of Amazon EC2 and Amazon EMR were performed, including a study of their pricing models, and measures are suggested for future studies and research.

Highlights

  • There are three service models provided in the cloud, the first being Infrastructure-as-a-Service (IaaS), in which the consumer is given the capability to provision storage, processing, and networks and to run arbitrary services

  • The Microsoft Excel 2010 built-in function T-TEST was used for statistical analysis of the results obtained from the benchmarks on Amazon EC2 and Amazon Elastic MapReduce (EMR)

  • Graphs are plotted for Amazon EC2 and Amazon EMR cloud services for comparing their performance
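As an illustration of the kind of two-sample comparison mentioned in the highlights, the following is a minimal sketch of a t-test computation using only the Python standard library. It is not the study's procedure: the study used Excel, and the variant of the test it applied is not stated here, so the unequal-variance (Welch) form is an assumption. The function name `welch_t` and both lists of run times are hypothetical and are not results from the study. The sketch stops at the t statistic and the degrees of freedom; turning these into a p-value requires the t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical benchmark run times in seconds, NOT data from the study.
ec2_times = [100.0, 102.0, 98.0, 101.0, 99.0]
emr_times = [110.0, 108.0, 112.0, 109.0, 111.0]

t_stat, dof = welch_t(ec2_times, emr_times)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A large absolute t value relative to the degrees of freedom suggests that the difference in mean run time between the two platforms is unlikely to be due to chance alone.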


Summary

Introduction

There are three service models provided in the cloud. In Infrastructure-as-a-Service (IaaS), the consumer is provided with the capability to provision storage, processing, and networks and to run arbitrary services. In Platform-as-a-Service (PaaS), the consumer is provided with the capability to deploy applications on the cloud using the provider’s tools, libraries, and languages. In this model, the infrastructure is controlled by the provider, and the consumer only has access to deploy applications and change configuration settings related to deployment. In Software-as-a-Service (SaaS), the consumer is provided with the capability to use the provider’s applications that are running on the cloud. In this case, the applications are accessible from either a web interface or a program interface.

