Towards a web-scale data management ecosystem demonstrated by SAP HANA

Franz Faerber, Jonathan Dees, Martin Weidner, Stefan Baeuerle, Wolfgang Lehner

doi:10.1109/icde.2015.7113374

Open Access

Abstract

Over the years, data management has diversified and moved in multiple directions, mainly driven by significant growth in the application space with different usage patterns, a massive change in the underlying hardware characteristics, and, last but not least, growing data volumes to be processed. A solution matching these constraints has to cope with a multidimensional problem space that includes techniques for dealing with a large number of domain-specific data types, data and consistency models, deployment scenarios, and processing, storage, and communication infrastructures at the hardware level. Specialized database engines are available and positioned in the market, each optimizing a particular dimension while relaxing other aspects (e.g., web-scale deployment with relaxed consistency). Today it is widely accepted that no single engine can handle all the different dimensions equally well, and therefore there are very good reasons to tackle this problem and optimize the dimensions with specialized approaches in a first step. However, we argue for a second step (reflecting, in our opinion, the even harder problem): a deep integration of individual engines into a single coherent and consistent data management ecosystem that provides not only shared components but also a common understanding of the overall business semantics. More specifically, a data management ecosystem provides a common “infrastructure” for software and data life cycle management, backup/recovery, replication and high availability, accounting and monitoring, and many other operational topics, where administrators and users expect a harmonized experience. More importantly, from an application perspective, customer experience teaches us to provide a consistent business view across all components and the ability to seamlessly combine different capabilities.
For example, within recent customer-based Internet of Things scenarios, a huge potential exists in combining graph-processing functionality with temporal and geospatial information and keywords extracted from high-throughput Twitter streams. Using SAP HANA as the running example, we want to demonstrate what moving a set of individual engines and infrastructural components towards a holistic but also flexible data management ecosystem could look like. Although solutions for some of these problems are already visible on the horizon, we encourage the database research community in general to focus more on the big picture, providing a holistic, integrated approach to efficiently deal with different types of data, different access methods, and different consistency requirements. Research in this field would push the envelope far beyond the traditional notion of data management.