Abstract

Companies worldwide are increasingly focused on Big Data, which has become an invaluable tool for supporting business processes and data analysis. SQL-on-Hadoop is one small part of the Big Data platforms developed to date. Our research implements Big Data platforms on Cloudera and Hortonworks, runs the TPC-H benchmark on their SQL-on-Hadoop systems, and evaluates the characteristics and performance of the query processing engines in each scenario applied to each platform. We focus on evaluating the two platforms, Cloudera and Hortonworks, to determine the advantages and disadvantages of each. The comparison is based on TPC-H, a recognized decision support benchmark, and covers four different scenarios run on the same configuration. The results show that Cloudera with Impala can process queries up to 41x faster than LLAP-Tez and 200x faster than Hive-Spark.
