Abstract

Big data ecosystems are complex, data-intensive, digital–physical systems. Data-intensive ecosystems offer a number of benefits, but they also present challenges. One major challenge relates to privacy and security. A number of privacy and security models, techniques and algorithms have been proposed over time. Their limitation is that they focus primarily on an individual system or an isolated organizational context. There is a need to study and provide complete end-to-end solutions that ensure security and privacy throughout the data lifecycle across the ecosystem, beyond the boundary of an individual system or organizational context. This study reviews the existing privacy and security challenges and solutions using the systematic literature review (SLR) approach. Based on the SLR approach, 79 applicable articles were selected and analyzed. Information extracted from these articles was used to compile a catalogue of security and privacy challenges in big data ecosystems and to highlight their interdependencies. The results were categorized from a theoretical viewpoint, using adaptive enterprise architecture, and from a practical viewpoint, using the DAMA framework, as guiding lenses. The findings of this research will help to identify research gaps and draw novel research directions in the context of privacy and security in big data-intensive ecosystems.

Highlights

  • The history and relevance of big data can be traced back to the origin of the Internet

  • Using a systematic literature review (SLR) approach, we selected and reviewed 79 studies (s1–s79) and identified 21 key privacy and security challenges relevant to big data ecosystems

  • The adaptive enterprise architecture (EA) framework was chosen because it is an overarching framework that covers the important layers of a big data ecosystem (BDE)


Summary

Introduction

The history and relevance of big data can be traced back to the origin of the Internet. The Internet can be considered a global network of machines comprising data and applications. The term “big data” was first used in 1999 in an academic paper, which led to a more detailed characterization of big data in 2003 [1]. The gradual growth in data volume led to the emergence of open-source big data technologies and applications, such as Apache Hadoop in 2005. According to the McKinsey Global Institute report on big data as the next frontier for innovation, competition and productivity, a typical US company with 1,000 employees could store 200 terabytes of data by 2009 [2]. The frequent use of mobile devices by businesses and consumers then caused the volume of data generated each day to explode [3, 4].

