Abstract

In recent years, the World Wide Web has emerged as the most promising external data source for organizations’ Data Warehouses, supplying the insights needed for comprehensive decision making and a competitive edge. However, when a Data Warehouse draws on external data sources from the Web without evaluating their quality, the quality of the warehouse itself can suffer. Quality models have been proposed in the research literature to evaluate and select Web Data Sources for integration in a Data Warehouse, but these models have only been proposed conceptually and have not been empirically validated. Therefore, in this paper, the authors present an empirical validation, conducted with 57 subjects, of the set of 22 quality factors and the initial structure of the multi-level, multi-dimensional WebQMDW quality model. The validated and restructured WebQMDW model thus obtained can significantly enhance decision making in the DW by supporting the selection of high-quality Web Data Sources.

Highlights

  • The importance of incorporating external data in the Data Warehouse to gain valuable insights into the market, competitors, products, or customers for comprehensive and unbiased decision making has long been recognized in the research literature [1], [2]

  • To address the lack of empirical validation of existing quality models, we present the empirical validation of the state-of-the-art multi-level, multi-dimensional WebQMDW (Web quality model for evaluating web sources for the DW) quality model [27] to enhance decision making in a Data Warehouse

  • We use a survey to thoroughly validate the set of quality factors and the structure of the WebQMDW model, following the research guidelines and principles proposed by Kitchenham and Pfleeger [38]–[43]

Summary

INTRODUCTION

The importance of incorporating external data in the Data Warehouse to gain valuable insights into the market, competitors, products, or customers for comprehensive and unbiased decision making has long been recognized in the research literature [1], [2]. For the quality-aware evaluation of Web Data Sources for a Data Warehouse, various quality models, frameworks, and sets of factors have been proposed in the research literature (see, for example, [19], [22]–[24], [4], [21], [20], [25], [26]). However, these models have only been proposed conceptually and have not been empirically validated. To fill this research gap, in this paper, we present the empirical validation of the state-of-the-art multi-level, multi-dimensional WebQMDW (Web quality model for evaluating web sources for the DW) quality model [27] to enhance decision making in a Data Warehouse.

Related Work and the WebQMDW Quality Model
Motivation
VALIDATION PROCESS OF THE WEBQMDW MODEL
Research Methodology
Setting of Objectives
Selection of Subjects
Selection of the Design of the Survey
Preparation of the Survey Instrument
Validation of the Survey Instrument
Administration of Survey
Analysis of the Data
Restructuring of the Quality Factors of the WebQMDW Model
THREATS TO VALIDITY
Internal Validity
External Validity
Conclusion Validity
CONCLUSION AND FUTURE WORK
