Abstract

Big Data is the term used for data sets that are too large and complex to be processed easily by traditional systems. New technology is needed to process these large data sets. Apache Hadoop is a good option, and its many components work together to make the Hadoop ecosystem robust and efficient. Apache Pig is a core component of the Hadoop ecosystem and accepts tasks in the form of scripts. To run these scripts, Apache Pig may use either the MapReduce or the Apache Tez framework. In our previous paper we analyzed how these two frameworks differ from each other on the basis of some chosen parameters, comparing them both theoretically and empirically on a single-node cluster. In this paper we perform the analysis on a multinode cluster installed on the Amazon cloud.

Highlights

  • Apache Hadoop is one of the technologies for handling Big Data; it is an open source project maintained by contributors around the world [2]

  • MapReduce is the backbone of the Hadoop ecosystem, and Apache Pig relies on this framework. Apache Tez also works for Apache Pig, but it is very useful in interactive scenarios

  • A Pig script may be written in different ways, but the way the script is written does not affect our experiment because the same script is used for both frameworks


Summary

Introduction

Data on servers has increased very rapidly, and current technologies are unable to retrieve useful information from the data already stored [1]. Apache Hadoop is one of the technologies for handling Big Data; it is an open source project maintained by contributors around the world [2]. Hortonworks Data Platform provides a single platform for all the Hadoop components [3]. Apache Pig is one of the core components of the Hadoop ecosystem. It accepts jobs submitted in the form of scripts. In our previous paper we compared the two frameworks, MapReduce and Apache Tez, both theoretically and empirically on the basis of some parameters. In this paper we put emphasis on both theoretical and empirical parameters and try to analyze that

