Abstract

Real-time analysis of data is a growing trend that aims to yield useful insights while spending very little time on data preprocessing. Analysis requires moving data from various heterogeneous or homogeneous sources to a common place known as the data warehouse. The data sources for the data warehouse are transaction processing systems. Data is moved from the transactional database to the data warehouse using the extract, transform, and load (ETL) process. Previously, ETL was run during off-peak hours, for example as a nightly load or on weekends. Real-time analysis requires ETL to be fast and not wait for off-peak hours. This leads to the concept of near real-time ETL, in which techniques are employed to identify potentially changed data in the transactional database and move it to the analysis database with minimal delay. Moving data in real time from multiple sources in incremental form can lead to anomalies in the data warehouse. This work discusses the various causes of these anomalies and solutions to overcome them. Our main contribution is to load data into temporary tables in order to reduce query execution time when overcoming refreshment anomalies.
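The incremental movement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how changed data is identified or how the temporary tables are used, so the sketch assumes a timestamp watermark for change detection and a single SQLite database holding both sides. The table names `src_orders`, `dw_orders`, and `staging_orders`, the column `updated_at`, and the function `incremental_load` are all hypothetical.

```python
import sqlite3


def incremental_load(conn, last_watermark):
    """Move rows changed since last_watermark into the warehouse table.

    Assumption: changes are detected by an integer `updated_at` column;
    the source abstract does not say which technique the authors use.
    """
    cur = conn.cursor()
    # Temporary staging table (hypothetical schema), emptied on every run.
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS staging_orders ("
        "id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
    )
    cur.execute("DELETE FROM staging_orders")
    # Extract: copy only the rows changed after the previous watermark.
    cur.execute(
        "INSERT INTO staging_orders "
        "SELECT id, amount, updated_at FROM src_orders WHERE updated_at > ?",
        (last_watermark,),
    )
    # Load: insert new rows and overwrite existing ones by primary key.
    cur.execute(
        "INSERT OR REPLACE INTO dw_orders (id, amount, updated_at) "
        "SELECT id, amount, updated_at FROM staging_orders"
    )
    newest = cur.execute("SELECT MAX(updated_at) FROM staging_orders").fetchone()[0]
    conn.commit()
    # Keep the old watermark when nothing changed.
    return newest if newest is not None else last_watermark


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src_orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER);
        CREATE TABLE dw_orders  (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER);
        INSERT INTO src_orders VALUES (1, 10.0, 100), (2, 20.0, 105);
        """
    )
    watermark = incremental_load(conn, 0)  # initial load of both rows
    # Simulate new transactional activity on the source.
    conn.execute("UPDATE src_orders SET amount = 25.0, updated_at = 110 WHERE id = 2")
    conn.execute("INSERT INTO src_orders VALUES (3, 30.0, 112)")
    watermark = incremental_load(conn, watermark)  # moves only rows 2 and 3
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM dw_orders ORDER BY id"
    ).fetchall()
    return watermark, rows


if __name__ == "__main__":
    print(demo())
```

In this sketch the second call touches only the two changed rows, which is the point of an incremental load. The refreshment anomalies the abstract refers to arise when several sources are refreshed this way at different times; handling them is outside what this example shows.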
