Abstract

Data warehouses store historical records on which analysis queries are executed. They are populated from the transaction database through a process called Extract, Transform and Load (ETL). Extract identifies the transaction database records to be moved to the data warehouse. Transform normalizes the records to suit the data warehouse environment. Load stores the records in the data warehouse. Today, businesses want to perform real-time analysis of data, which requires that updated records be available in the data warehouse. This can be achieved through near-real-time ETL, i.e., by improving the existing ETL process. The existing process can be improved in several ways, such as increasing the frequency of loads, or identifying the changes that are relevant for analysis and moving only those changes. Increasing the load frequency alone is not useful in many cases. Hence we should identify the changes and move them only if they are useful for analysis, which makes the ETL smart. In this paper we study three such techniques. The first creates a replica of a dimension table in the data warehouse and moves the changes to the replica, which reduces the query response time. The second loads the data in parallel so that the loading time is reduced. The third identifies changes and triggers the loading process. We used the first approach to create a replica and loaded the data in parallel, and observed that parallel loading reduces not only the loading time but also the query response time. In the first approach, once the query response time reaches a certain limit, existing methods suggest moving the data from the replica dimensions to the original dimensions. Here we introduce a right-time trigger for the move, rather than relying on the query response time alone. We found that combining the query response time with the right-time trigger reduces the number of moves.
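
The extract, transform and load steps, together with the ideas of moving only changed records and loading them in parallel into a replica, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the replica dimension is modeled as a plain Python dictionary, parallel loading is modeled with threads, and every name (`extract_changes`, `transform`, `load_parallel`, the `updated_at` field) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


def extract_changes(source_rows, last_loaded_at):
    """Extract: keep only rows modified after the previous load (assumed timestamp field)."""
    return [row for row in source_rows if row["updated_at"] > last_loaded_at]


def transform(row):
    """Transform: normalize a row for the warehouse (here: trim and upper-case the name)."""
    return {
        "id": row["id"],
        "name": row["name"].strip().upper(),
        "updated_at": row["updated_at"],
    }


def load_parallel(rows, replica, workers=4):
    """Load: split rows into chunks and write them to the replica concurrently."""
    lock = Lock()

    def load_chunk(chunk):
        for row in chunk:
            clean = transform(row)
            with lock:  # guard the shared replica while writing
                replica[clean["id"]] = clean

    # Round-robin partitioning; some chunks may be empty for small inputs.
    chunks = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(load_chunk, chunks))  # list() surfaces worker exceptions


# Hypothetical transaction records; timestamps are plain integers for brevity.
source = [
    {"id": 1, "name": "alice", "updated_at": 5},
    {"id": 2, "name": " bob ", "updated_at": 12},
    {"id": 3, "name": "carol", "updated_at": 20},
]
replica = {}  # stands in for the replica dimension table
changes = extract_changes(source, last_loaded_at=10)
load_parallel(changes, replica)
```

In this sketch only the records with ids 2 and 3 are newer than the last load, so only they are transformed and written to the replica; the decision of when to move replica contents to the original dimension is left out because it depends on the trigger policy.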
