Abstract

Big data is a broad term with numerous dimensions, most notably: big data characteristics, techniques, software systems, application domains, computing platforms, and big data milieu (industry, government, and academia). In this paper we briefly introduce fundamental big data characteristics and then present seven case studies of big data techniques, systems, applications, and platforms, as seen from an academic perspective (industry and government perspectives are not the subject of this publication). While we feel that it is difficult, if at all possible, to encapsulate all of the important big data dimensions in a strict and uniform, yet comprehensible language, we believe that a set of diverse case studies like the one offered in this paper, one that spans the principal big data dimensions, can indeed be beneficial to the broad big data community by helping experts in one realm better understand current trends in the other realms.

Highlights

  • The principal dimensions of big data include its defining characteristics, techniques, software systems, applications, computing platforms, and big data milieu. The big data dimensions are broad and in perpetual change.

  • We believe that reports like this one, presenting case studies with broad coverage of the big data realm, can be beneficial for the broad big data community.

  • In the field of image data mining, we developed an approach for extending the learning set of a classification algorithm with additional metadata.

Summary

INTRODUCTION

The principal dimensions of big data include its defining characteristics (such as volume, velocity, variety, and veracity), techniques (such as data mining, machine learning, natural language processing, neural networks, clustering, pattern recognition, sentiment analysis, predictive modeling, supervised learning, and time series analysis, to mention a few), software systems (such as Hadoop, Spark, and NoSQL DBMSs), applications (such as business analytics, marketing, healthcare, research, performance optimization, security, law enforcement, transportation, and many others), computing platforms (such as clusters, NUMA in-memory database servers, and cloud computing platforms), and big data milieu (such as industry, government, and academia). The big data dimensions are broad and in perpetual change. This is why the task of compiling and maintaining a specification that is rigorous yet comprehensible seems impractical. Our broad collection of case studies can potentially help experts in one big data dimension expand their understanding of other dimensions. All case studies are based on the authors’ own research projects.

Semantic Enhancement of Data with Ontologies
Metadata in Image Data Mining
NoSQL versus RDBMS
From Hadoop MapReduce to Spark
Astronomy and Earth Science
Biomedical Research
PLATFORMS
CONCLUSION
