Incremental load is an important factor for successful data warehousing. Lack of standardized incremental refresh methodologies can lead to poor analytical results, which can be unacceptable to an organization’s analytical community. Successful data warehouse implementation depends on consistent metadata as well as incremental data load techniques. If consistent load timestamps are maintained and efficient transformation algorithms are used, it is possible to refresh databases with complete accuracy and with little or no manual checking. This paper proposes an Extract-Transform-Load (ETL) metadata model that archives load observation timestamps and other useful load parameters. The author also recommends algorithms and techniques for incremental refreshes that enable table loading while ensuring data consistency, integrity, and improving load performance. In addition to significantly improving quality in incremental load techniques, these methods will save a substantial amount of data warehouse systems resources.
Read full abstract