Abstract

Big data systems are increasingly adopted by enterprises that rely on big data applications to manage data-driven processes, practices, and systems in an enterprise-wide context. Specifically, big data systems and their underlying applications empower enterprises with analytical decision making (e.g., recommender and decision support systems) to optimize organizational productivity, competitiveness, and growth. Despite these benefits, big data applications face challenges that include, but are not limited to, the security and privacy, authenticity, and reliability of critical data; failures in these areas may propagate false information across systems. Data provenance, as an approach and enabling mechanism for identifying the origin, managing the creation, and tracking the propagation of information, can address these challenges for data management in an enterprise context. Data provenance solutions can help stakeholders and enterprises assess the quality of data, along with the authenticity, reliability, and trustworthiness of information, on the basis of the identity, reproducibility, and integrity of data. Considering the widespread adoption of big data applications and the need for data provenance, this paper focuses on (i) analyzing the state of the art for a holistic presentation of provenance in big data applications and (ii) proposing a bio-inspired approach, with an underlying algorithm that exploits the human thinking process, to support data provenance in Wireless Sensor Networks (WSNs). The proposed ‘Think-and-Share Optimization’ (TaSO) algorithm modularizes and automates data provenance in WSNs that are deployed and operated in enterprises. Evaluation of the TaSO algorithm demonstrates its efficiency in terms of connectivity, closeness to the sink node, coverage, and execution time. The proposed research contextualizes bio-inspired computation to enable and optimize data provenance in WSNs.
Future research aims to exploit machine learning techniques (with underlying algorithms) to automate data provenance for big data systems in networked environments.

Highlights

  • The provenance of an object or data includes information about the ownership, source, transformation and evolution of data or object during their life span [1]

  • The term data provenance, as per the Encyclopedia of Database Systems, formally refers to “a record trail that accounts for the origin of a piece of data together with an explanation of how and why it got to the present place” [33]

  • This paper addresses the challenge of node trust in Wireless Sensor Networks (WSNs) that transmit large amounts of data in an enterprise context

Introduction

The provenance of an object or data includes information about the ownership, source, transformation, and evolution of that data or object during its life span [1]. Provenance information enhances data trustworthiness (identification of data sources) and supports data compliance (policies for data processing) to ensure accountability [2]. Data provenance is a well-known research area within databases and data mining. It considers the problem of identifying the origin, the creation, and the propagation processes of data [4]. It may be defined as the process of detecting the lineage and the derivation of data and data objects [5]. In the case of security violations, a system administrator should be able to identify the origin of the error, in addition to its causes and impacts [6].
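
To make the notions of ownership, source, transformation, and lineage concrete, the following is a minimal illustrative sketch in Python. It is not the paper's TaSO algorithm or any data model defined by the authors; the `ProvenanceRecord` class, its field names, the `trace_lineage` helper, and the sample sensor records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ProvenanceRecord:
    """Hypothetical provenance entry for one data item (illustrative only)."""
    data_id: str
    owner: str            # who holds the item at this stage
    source: str           # where the item came from
    transformation: str   # what was done to produce it
    parent_id: Optional[str] = None  # the item this one was derived from


def trace_lineage(records: Dict[str, ProvenanceRecord], data_id: str) -> List[str]:
    """Walk parent links from a data item back to its origin."""
    lineage: List[str] = []
    seen = set()
    current: Optional[str] = data_id
    # Stop at the origin, at an unknown id, or if a cycle is detected.
    while current is not None and current in records and current not in seen:
        seen.add(current)
        lineage.append(current)
        current = records[current].parent_id
    return lineage


# Assumed example: a reading captured by a sensor node, cleaned at a
# gateway, and aggregated at the sink.
records = {
    r.data_id: r
    for r in [
        ProvenanceRecord("raw", "node-7", "temperature sensor", "captured"),
        ProvenanceRecord("clean", "gateway", "node-7", "outliers removed", parent_id="raw"),
        ProvenanceRecord("avg", "sink", "gateway", "hourly mean", parent_id="clean"),
    ]
}

lineage = trace_lineage(records, "avg")
print(" <- ".join(lineage))
```

Following the parent links in this way is one simple means by which an administrator could trace an erroneous value at the sink back to the node that originated it.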
