Abstract

Thanks to the advances achieved in the last decade, the lack of adequate technologies to deal with Big Data characteristics such as Data Volume is no longer an issue. Instead, recent studies highlight that one of the main Big Data issues is the lack of expertise needed to select adequate technologies and build the correct Big Data architecture for the problem at hand. To tackle this problem, we present a methodology for the generation of Big Data pipelines based on several requirements derived from Big Data features that are critical for the selection of the most appropriate tools and techniques. Our approach thus reduces the know-how required to select and build Big Data architectures by providing a step-by-step methodology that guides Big Data architects in creating the Big Data pipeline for the case at hand. Our methodology has been tested in two use cases.

Highlights

  • In recent years, the large number of publications on Big Data techniques [1]–[4], technologies [5]–[7] and applications [8]–[10] highlights the importance of the Big Data phenomenon in the field of data processing and analysis

  • This methodology is based on our previous research [16], where we proposed a first version of the methodology and evaluated its application through an Internet of Things (IoT) case study with application in Smart Cities

  • In this new version of the methodology, we have focused our research on the generation of the Big Data architecture from the requirements of the target Big Data application, and we have updated its application to the current technologies and challenges of Big Data discussed above


Summary

INTRODUCTION

In recent years, the large number of publications on Big Data techniques [1]–[4], technologies [5]–[7] and applications [8]–[10] highlights the importance of the Big Data phenomenon in the field of data processing and analysis. Several methodological approaches have been published [1]–[4], [15] with the aim of providing effective solutions to the above-mentioned problems. Most of these proposals are based on the analysis of the requirements derived from the 5 V's, or characteristics of Big Data, which we consider essential to describe a Big Data scenario and a key factor in the choice of the most appropriate techniques and tools for the Big Data pipeline. In this paper we present an iterative methodology to help IT professionals with the definition and validation of Big Data architectures for analytical applications. This methodology is based on our previous research [16], where we proposed a first version of the methodology and evaluated its application through an Internet of Things (IoT) case study with application in Smart Cities (data analysis of distributed Smart Meters).
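The core idea of selecting pipeline components from 5V-derived requirements can be illustrated with a minimal sketch. This is not the authors' actual selection procedure, and the technology names and requirement tags below are illustrative assumptions only: each candidate tool is tagged with the Big Data characteristics it addresses, and for each pipeline stage the candidate covering the most requirements is chosen.

```python
# Hypothetical sketch of requirement-driven tool selection (illustrative only;
# the tools and their requirement tags are assumptions, not the paper's tables).

# Each candidate technology is tagged with the 5V-derived requirements it addresses.
CANDIDATES = {
    "ingestion": {
        "Apache Kafka": {"velocity", "volume"},
        "Apache Flume": {"volume"},
    },
    "storage": {
        "HDFS": {"volume"},
        "Apache Cassandra": {"volume", "velocity"},
        "MongoDB": {"variety"},
    },
    "processing": {
        "Apache Spark": {"volume", "velocity"},
        "Apache Flink": {"velocity", "veracity"},
    },
}

def select_pipeline(requirements):
    """For each pipeline stage, pick the candidate covering the most requirements."""
    pipeline = {}
    for stage, techs in CANDIDATES.items():
        best = max(techs, key=lambda t: len(techs[t] & requirements))
        pipeline[stage] = best
    return pipeline

# Example: an IoT scenario dominated by Velocity and Volume.
print(select_pipeline({"velocity", "volume"}))
```

In practice such a mapping would be far richer (weights, compatibility constraints between stages, non-functional requirements), but it captures why characterising a scenario in terms of the 5 V's is the key input to architecture generation.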

RELATED WORK
REAL USE CASE
BIG DATA PIPELINE EVALUATION
IOT USE CASE EVALUATION
Findings
CONCLUSIONS AND FUTURE WORK
