Abstract
Before the arrival of the Big Data era, data warehouse (DW) systems were considered the best decision support systems (DSS). DW systems have long helped organizations around the world analyze their stored data and use it to make decisions. However, analyzing and mining poor-quality data can lead to wrong conclusions. Several data quality (DQ) problems can appear during a data warehouse project, such as missing values, duplicate values, and integrity constraint violations. As a result, organizations are increasingly aware of the importance of data quality and invest heavily in managing it in their DW systems. On the other hand, the arrival of Big Data (BD) brings new challenges, such as the need to collect the most recent data and the ability to make real-time decisions. This article surveys the existing techniques for controlling the quality of data stored in DW systems and the new solutions proposed in the literature to meet the new Big Data requirements.