Abstract

Big data are everywhere: high volumes of a wide variety of valuable data, both precise and uncertain, can be easily collected or generated at high velocity in many real-life applications. Embedded in these big data are rich sets of useful information and knowledge. To mine these big data and discover that information and knowledge, we present a data analytic algorithm in this article. Our algorithm manages, queries, and processes uncertain big data in cloud environments. More specifically, it manages transactions of uncertain big data, allows users to query these data by specifying constraints that express their interests, and processes the user-specified constraints to discover useful information and knowledge from the uncertain big data. Because each item in every transaction is associated with an existential probability value expressing the likelihood that the item is present in that transaction, computation can be intensive. Our algorithm therefore uses the MapReduce model in a cloud environment for effective data analytics on these uncertain big data. Experimental results show the effectiveness of our data analytic algorithm for managing, querying, and processing uncertain big data in cloud environments.

Highlights

  • Big data [1,2,3] are everywhere

  • MrCloud manages transactions of uncertain big data, allows users to query these big data by specifying anti-monotone constraints expressing their interests, and processes the user-specified constraints to discover useful information and knowledge in the form of frequent patterns from the uncertain big data

  • We evaluated our proposed data analytic algorithm, MrCloud, in mining frequent patterns that satisfy user-specified constraints from uncertain big data


Summary

Introduction

Big data [1,2,3] are everywhere. They are high-veracity, high-velocity, high-value, and/or high-variety data with volumes beyond the ability of commonly used software to manage, query, and process within a tolerable elapsed time. For example, by applying association rule mining to valuable big market basket data, data scientists can help shop owners and managers find interesting or popular patterns that reveal customer purchase behaviour. Our key contribution is a data analytic algorithm called MrCloud, which uses the MapReduce model in cloud environments for managing, querying, and processing uncertain big data. MrCloud manages transactions of uncertain big data, allows users to query these big data by specifying anti-monotone constraints expressing their interests, and processes the user-specified constraints to discover useful information and knowledge, in the form of frequent patterns, from the uncertain big data.
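To make the ideas above concrete, the following is a minimal single-machine sketch of constrained frequent pattern mining over uncertain transactions in a map/reduce style. It is not the MrCloud implementation. It assumes the definition commonly used in uncertain data mining, where items are treated as independent and the expected support of a pattern is the sum, over all transactions, of the product of its items' existential probabilities. The toy transactions, the `PRICE` table, the "every item costs at most `max_price`" constraint, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import prod

# Hypothetical toy data: each transaction maps an item to its
# existential probability (likelihood the item is present).
TRANSACTIONS = [
    {"a": 0.9, "b": 0.8, "c": 0.4},
    {"a": 0.6, "b": 0.5},
    {"b": 0.7, "c": 0.9},
]

# Hypothetical auxiliary attribute used by the example constraint.
PRICE = {"a": 10, "b": 20, "c": 50}


def satisfies(pattern, max_price=30):
    """Example anti-monotone constraint: every item costs <= max_price.

    If a pattern violates it, so does every superset of that pattern.
    """
    return all(PRICE[item] <= max_price for item in pattern)


def map_phase(transaction, size):
    """Emit (pattern, probability) pairs for one transaction.

    Items that violate the constraint on their own are dropped first;
    anti-monotonicity guarantees no valid pattern can contain them.
    """
    valid = sorted(item for item in transaction if satisfies((item,)))
    for pattern in combinations(valid, size):
        # Assumes item independence: P(pattern in t) = product of item probs.
        yield pattern, prod(transaction[item] for item in pattern)


def reduce_phase(pairs, minsup):
    """Sum the contributions per pattern and keep the frequent ones."""
    totals = defaultdict(float)
    for pattern, probability in pairs:
        totals[pattern] += probability
    return {p: s for p, s in totals.items() if s >= minsup}


def mine(transactions, size, minsup):
    """Run the map and reduce steps sequentially (no real cluster here)."""
    pairs = (kv for t in transactions for kv in map_phase(t, size))
    return reduce_phase(pairs, minsup)
```

Calling `mine(TRANSACTIONS, 2, 1.0)` on this toy data keeps only the pattern `("a", "b")`, whose expected support is 0.9 × 0.8 + 0.6 × 0.5 = 1.02. Item `c` never reaches the reduce step because it fails the constraint by itself, which is the kind of early pruning an anti-monotone constraint permits.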

Big Data Mining with the MapReduce Model
Constrained Mining
Uncertain Data Mining
MrCloud
Managing Uncertain Big Data
Querying Uncertain Big Data
Processing Uncertain Big Data
Evaluation Results
Conclusions
