Abstract

Business Intelligence (BI) systems are crucial for enterprise improvement. They consolidate heterogeneous data from distributed sources and feed high-quality data into strategic indicators. An essential component of this consolidation is Extraction, Transformation and Loading (ETL), which is responsible for extracting data from heterogeneous sources and then transforming, restructuring and integrating it into a homogeneous data warehouse. Because of the limitations of traditional ETL, the performance of the entire ETL component degrades on massive data. To address this challenge, we propose a novel workflow framework for parallel ETL execution based on a multi-agent system. The purpose of the system is to use a parallel strategy to improve the efficiency of the ETL process. We observe that some ETL activities are often executed at the same priority or use the same input data. Based on this observation, this paper presents a parallel ETL framework built on agent theory and multi-threading techniques. The experimental results show that the proposed approach can greatly improve the efficiency of the ETL process.
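
The observation that ETL activities sharing a priority can run side by side can be illustrated with a minimal sketch. This is not the framework proposed in the paper: the agents are not modeled, a plain thread pool stands in for them, and the function name `run_parallel_etl`, the `(priority, name, func)` tuple layout and the sample activities are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def run_parallel_etl(activities, max_workers=4):
    """Run (priority, name, func) activities (hypothetical layout).

    Activities with the same priority run concurrently; priority levels
    run one after another, lowest number first.
    """
    results = {}
    ordered = sorted(activities, key=lambda a: a[0])
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for _, group in groupby(ordered, key=lambda a: a[0]):
            # Submit every activity of this priority level at once ...
            futures = {name: pool.submit(func) for _, name, func in group}
            # ... and wait for all of them before starting the next level.
            for name, future in futures.items():
                results[name] = future.result()
    return results


# Illustrative activities only: two share priority 1 and the same input rows.
rows = [" alice ", "BOB", " carol"]
activities = [
    (1, "trim", lambda: [r.strip() for r in rows]),
    (1, "count", lambda: len(rows)),
    (2, "report", lambda: "loaded"),
]
results = run_parallel_etl(activities)
```

In this sketch "trim" and "count" are submitted together because they share priority 1, while "report" starts only after both have finished. Threads help most when the activities are I/O-bound, such as reading from sources or writing to the warehouse.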

