Abstract

As social media platforms become part of everyday life, the volume of data exchanged across them has risen sharply. Data from social networking sites can help companies identify customer trends and improve operational efficiency to gain a competitive edge. However, traditional decision support systems cannot meet the growing need of modern enterprises to integrate and analyze the wide variety of data generated by social network platforms. Handling such large volumes of data requires new data management techniques and data storage architectures that can retrieve information quickly. In this context, the data lake has emerged as one of the most recent storage concepts introduced to address this challenge. A data lake is a large repository that stores and manages all of a company's data in raw form before it is integrated into the data warehouse. In this paper, we propose a new approach for designing a NoSQL data warehouse from a data lake. More precisely, we first review recent literature on NoSQL data warehouse design approaches. Then, we describe the main concepts of a NoSQL data lake that stores big data collected from social networks such as Facebook, Twitter, and YouTube. Finally, we define a set of mapping rules to integrate social media data from the data lake into the NoSQL data warehouse, based on two NoSQL logical models: column-oriented and document-oriented.
