Abstract

With the advancement of modern technologies, users, devices, and applications now routinely generate very large amounts of data, commonly called Big Data. Traditional approaches to preprocessing, storing, and analyzing data are based on the data warehouse, but preprocessing data at the scale of Big Data is costly in both computation and money. Hence the alternative concept of the data lake originated, which can store raw data of any type. Both data warehouses and data lakes can be considered methods of storing and processing Big Data; however, data lakes are often considered a panacea for Big Data problems. The main Big Data challenges that data lakes can address are storing, processing, and analyzing heterogeneous data sources, whether structured, semi-structured, or unstructured. Data lake models can also incorporate privacy considerations to help ensure data security and privacy. Although data lakes create business value through the analysis and prediction of valuable information, they also raise cyber safety issues for enterprises in many sectors, including health care, defense, and finance. To address these problems, this chapter introduces data lakes, their components, and the associated challenges of processing and storing Big Data. It also presents the cyber safety and security issues relating to data lakes for the most frequently targeted enterprises.
