Abstract

One of the most popular approaches to building analytical platforms is the data lake. A data lake is a storage system in which data are kept in their original format, which makes it difficult to run analytics or present aggregated data. To address this, data marts are used: stores of highly specialized data tailored to the requests of employees of a particular department or a particular line of an organization’s work. This article presents a study of big data storage formats on the Apache Hadoop platform when used to build data marts.

Highlights

  • When developing analytical systems, solving the issue of storing the loaded data becomes an important task

  • One of the software tools that allows the building of both data lakes and data marts is the Apache Hadoop platform [8]

  • This paper presents a study of data storage formats as applied to building data marts for generating reports during mass testing


Summary

Introduction

Experimental Characteristics Study of Data Storage Formats for Data Marts Development within Data Lakes

When developing analytical systems, solving the issue of storing the loaded data becomes an important task. One of the software tools that allows the building of both data lakes and data marts is the Apache Hadoop platform [8].
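To make the distinction between a data lake and a data mart concrete, here is a minimal sketch in plain Python. It is not the authors' method and does not use Hadoop: the record fields (`participant`, `department`, `score`), the sample values, and the function `build_data_mart` are all hypothetical, chosen only to show raw records kept in their original form being aggregated into a small, department-oriented summary.

```python
import json
from collections import defaultdict

# Hypothetical raw "data lake" content: one JSON document per line, kept in
# its original form. Field names and values are invented for illustration.
RAW_LAKE_LINES = [
    '{"participant": "p1", "department": "math", "score": 80}',
    '{"participant": "p2", "department": "math", "score": 60}',
    '{"participant": "p3", "department": "physics", "score": 90}',
]


def build_data_mart(raw_lines):
    """Aggregate raw records into one summary row per department."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for line in raw_lines:
        record = json.loads(line)  # parse the raw record on read
        bucket = totals[record["department"]]
        bucket["count"] += 1
        bucket["sum"] += record["score"]
    # The "mart" holds only the aggregated, report-ready values.
    return {
        dept: {"participants": b["count"], "mean_score": b["sum"] / b["count"]}
        for dept, b in totals.items()
    }


mart = build_data_mart(RAW_LAKE_LINES)
```

In a real deployment the raw lines would live in distributed storage and the aggregated rows would be written out in one of the storage formats under study; the sketch only illustrates why a pre-aggregated mart is easier to report from than the raw lake.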

