Analysis of data processing efficiency with use of Apache Hive and Apache Pig in Hadoop environment

Mikołaj Skrzypczyński, Piotr Muryjas

doi:10.35784/jcsi.4060

Open Access

https://doi.org/10.35784/jcsi.4060

Journal: Journal of Computer Sciences Institute
Publication Date: Mar 20, 2024
License type: CC BY-SA 4.0

#Apache Pig #Apache Hive

Abstract

This paper analyzes data processing efficiency with Apache Hive and Apache Pig in a Hadoop environment. The analysis compares the two tools on a large data set of 28 million records. The research used scripts and queries written for Apache Hive and Apache Pig, which were then executed 10 times in an environment provided by a virtual machine created for this purpose. These methods were run on the same data sets 16 times, according to previously prepared research scenarios. The authors conclude that Apache Hive is a more efficient tool than Apache Pig.
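
The abstract does not describe how execution times were measured. As a minimal sketch only, assuming wall-clock timing of a command-line invocation repeated a fixed number of times, a comparison of this kind could be set up as below. The function names (`time_command`, `summarize`) and the script file names in the comments are hypothetical and are not taken from the paper.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=10):
    """Run `cmd` `runs` times; return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits with a non-zero status.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of run durations to a few descriptive statistics."""
    return {
        "runs": len(durations),
        "mean_s": statistics.mean(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }


# Example use (assumes `hive` and `pig` are on PATH; file names are hypothetical):
#   hive_stats = summarize(time_command(["hive", "-f", "scenario_01.hql"], runs=10))
#   pig_stats = summarize(time_command(["pig", "-f", "scenario_01.pig"], runs=10))
```

Timing the whole process includes client start-up and job submission overhead for both tools, so this measures end-to-end latency rather than pure query execution time.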
