Abstract

ETL processes are the backbone of a data warehouse, since they supply it with the necessary integrated and reconciled data from heterogeneous and distributed data sources. However, ETL process development, and particularly its design phase, is still perceived as a time-consuming task. This is mainly because ETL processes are typically designed with a specific technology in mind from the very beginning of the development process. Thus, it is difficult to share and reuse methodologies and best practices among projects implemented with different technologies. To the best of our knowledge, no attempt has yet been made to harmonize ETL process development by proposing a common and integrated development strategy. To overcome this drawback, this paper introduces a framework for model-driven development of ETL processes. The benefit of our framework is twofold: (i) it uses vendor-independent models for a unified design of ETL processes, based on the expressive and well-known standard for modeling business processes, the Business Process Modeling Notation (BPMN), and (ii) it automatically transforms these models into the vendor-specific code required to execute the ETL process on a concrete platform.

