Abstract

Contribution: Real-time data warehousing (DWH) and big data streaming have recently become ubiquitous as a number of business organizations gear up to gain competitive advantage. The capability to organize big data efficiently in order to reach a business decision empowers data warehousing in terms of real-time stream processing. This paper presents a systematic literature review of real-time stream processing systems that rigorously examines their recent developments and challenges and can serve as a guide for implementing a real-time stream processing framework for all shapes of data streams.
Background: Published surveys and reviews cover either papers on stream analysis in applications other than real-time DWH or papers on extraction, transformation, loading (ETL) challenges for traditional DWH. This systematic review attempts to answer four specific research questions.
Research Questions: 1) Which are the relevant publication channels for real-time stream processing research? 2) Which challenges have been faced during implementation of real-time stream processing? 3) Which approaches/tools have been reported to address challenges introduced at the ETL stage while processing real-time streams for real-time DWH? 4) What evidence has been reported while addressing different challenges for processing real-time streams?
Methodology: A systematic literature review was conducted to compile studies related to publication channels targeting real-time stream processing/joins challenges and developments. Following a formal protocol, semi-automatic and manual searches were performed for work from 2011 to 2020, excluding research in traditional data warehousing. Of 679,547 papers selected for data extraction, 74 were retained after quality assessment.
Findings: This systematic literature review highlights implementation challenges along with the approaches developed for real-time DWH and big data stream processing systems, and provides their comparisons. The study found that various algorithms exist for implementing real-time join processing at the ETL stage for structured data, whereas less work on this subject is found for unstructured data.

Highlights

  • Real-time analytics are becoming ubiquitous for several application scenarios where well-timed business decisions are extremely important

  • The novelty of our systematic literature review (SLR) is that it provides new classification criteria, the publication channels targeted by real-time stream processing research, real-time data warehousing (DWH)/big data streaming challenges, and approaches to address these challenges, after validating studies empirically

  • It was found that most of the existing surveys and systematic reviews do not cover publication channels, approaches, challenges, and solutions targeting the real-time stream processing research needed in business intelligence, and focus mainly on tools used for big data analytics and DWH design approaches from social media

Summary

INTRODUCTION

Real-time analytics are becoming ubiquitous for several application scenarios where well-timed business decisions are extremely important. A broad category of applications participates in the continuous generation of massive data. Analysis of these streams is a big challenge, as the gathered data is heterogeneous and can be of any shape or nature, i.e., structured, semi-structured or unstructured, symmetrical or skewed. Real-time stream processing (referred to as in-memory processing of massive data) is generally required in two types of application domains: first where organising data. The focus of this study is to present an extensive systematic literature review (SLR) to gather different approaches for real-time stream processing for all possible application domains real-time DWH. The novelty of our SLR is that it provides new classification criteria, the publication channels targeted by real-time stream processing research, real-time DWH/big data streaming challenges, and approaches to address these challenges, after validating studies empirically.

RELATED WORK
ASSESSMENT OF RQ1
ASSESSMENT OF RQ2
ASSESSMENT OF RQ3
ASSESSMENT OF RQ4
Findings
CONCLUDED DISCUSSION