Abstract

Benchmarking data warehouses is a means to evaluate the performance of systems and the impact of different technical choices. Existing benchmarks, such as the Star Schema Benchmark (SSB), were developed on relational models, which have for years been the most widely used to support classical data warehousing applications. SSB is designed to measure the performance of database products when executing star schema queries. As the volume of data keeps growing, the types of data generated by applications become richer than before. As a result, traditional relational databases are challenged to manage big data. Many IT companies attempt to address big data challenges with NoSQL (Not only SQL) databases, possibly combined with a distributed computing system. NoSQL databases are known to be non-relational, horizontally scalable, and distributed. In this paper, we present a new benchmark for columnar NoSQL data warehouses, namely CNSSB (Columnar NoSQL Star Schema Benchmark). CNSSB is derived from SSB and generates synthetic data and a query set to evaluate column-oriented NoSQL data warehouses. We implemented CNSSB on the HBase column-oriented database management system (DBMS) and applied its query workload to compare the performance of two SQL skins, Phoenix and HQL (Hive Query Language). In this comparison, Phoenix performed better than HQL.

Keywords: Data warehouses, columnar databases, decisional benchmark
