Large companies with many branches in different locations often have difficulty analyzing the transaction processes of each branch. The problem faced by company management is that the massive volume of data produced by the branches must be delivered rapidly to the head office; when it is not, analysis of the company's performance becomes slow and inaccurate. The results of this analysis are used as a consideration in decision making, and they yield the right information only if the data are complete and relevant. A suitable method for collecting massive data is the data warehouse approach. A data warehouse is a relational database designed to optimize Online Analytical Processing (OLAP) queries over transaction data from various sources; it can record any changes that occur in the data, so that the data become more structured. To collect the data, a data warehouse uses extract, transform, and load (ETL) steps: read data from the Online Transaction Processing (OLTP) system, convert the data into a uniform structure, and save it to its final location in the data warehouse. This study provides an overview of a solution for implementing ETL in the Python programming language that can run automatically or manually as needed, which facilitates the ETL process and can adapt to the conditions of the databases in the company's system.
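The three ETL steps named in the abstract can be illustrated with a minimal Python sketch. This is not the study's implementation: it assumes SQLite in-memory databases stand in for the branch OLTP system and the warehouse, and the table names `sales` and `fact_sales`, the column layout, the `dd/mm/YYYY` source date format, and the function names are all hypothetical.

```python
import sqlite3
from datetime import datetime


def extract(oltp):
    """Extract: read raw transaction rows from the hypothetical OLTP table `sales`."""
    return oltp.execute(
        "SELECT id, branch, sold_on, amount FROM sales"
    ).fetchall()


def transform(rows):
    """Transform: bring rows into one uniform structure
    (trimmed upper-case branch codes, ISO dates, numeric amounts)."""
    uniform = []
    for row_id, branch, sold_on, amount in rows:
        # Assumed source date format: dd/mm/YYYY
        iso_date = datetime.strptime(sold_on, "%d/%m/%Y").date().isoformat()
        uniform.append((row_id, branch.strip().upper(), iso_date, float(amount)))
    return uniform


def load(warehouse, rows):
    """Load: write the uniform rows to the hypothetical warehouse table `fact_sales`."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales ("
        "id INTEGER, branch TEXT, sold_on TEXT, amount REAL, "
        "PRIMARY KEY (id, branch))"
    )
    # INSERT OR REPLACE keeps repeated runs from duplicating rows.
    warehouse.executemany(
        "INSERT OR REPLACE INTO fact_sales VALUES (?, ?, ?, ?)", rows
    )
    warehouse.commit()


def run_etl(oltp, warehouse):
    """Run one full ETL pass and return the number of rows processed."""
    rows = transform(extract(oltp))
    load(warehouse, rows)
    return len(rows)


# Demo data standing in for one branch's OLTP database.
oltp = sqlite3.connect(":memory:")
oltp.execute(
    "CREATE TABLE sales (id INTEGER, branch TEXT, sold_on TEXT, amount TEXT)"
)
oltp.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [(1, " jkt ", "05/01/2024", "120.50"), (2, "bdg", "06/01/2024", "80")],
)
oltp.commit()

warehouse = sqlite3.connect(":memory:")
loaded = run_etl(oltp, warehouse)
result = warehouse.execute(
    "SELECT id, branch, sold_on, amount FROM fact_sales ORDER BY id"
).fetchall()
```

Under these assumptions, `run_etl` can be called by hand for a manual run or invoked from an external scheduler for an automatic one; because the load step replaces rows with the same key, repeating a run does not duplicate data.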