Abstract

"Big data" has become a major force of innovation across enterprises of all sizes. New platforms with increasingly more features for managing big datasets are being announced almost on a weekly basis. Yet, there is currently a lack of any means of comparability among such platforms. While the performance of traditional database systems is well understood and measured by long-established institutions such as the Transaction Processing Performance Council (TCP), there is neither a clear definition of the performance of big data systems nor a generally agreed upon metric for comparing these systems. In this article, we describe a community-based effort for defining a big data benchmark. Over the past year, a Big Data Benchmarking Community has become established in order to fill this void. The effort focuses on defining an end-to-end application-layer benchmark for measuring the performance of big data applications, with the ability to easily adapt the benchmark specification to evolving challenges in the big data space. This article describes the efforts that have been undertaken thus far toward the definition of a BigData Top100 List. While highlighting the major technical as well as organizational challenges, through this article, we also solicit community input into this process.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.