Abstract

Big data has become a fundamental asset across many domains, enabling organizations to extract valuable insights and make informed decisions. Ensuring data quality, however, is essential to using big data effectively, and big data quality has therefore received growing attention from researchers and practitioners in recent years owing to its significant impact on decision-making. Existing studies that address data quality anomalies often have a limited scope, concentrating on specific issues such as outliers or inconsistencies; moreover, many approaches are context-specific and lack a generic solution applicable across domains. To the best of our knowledge, no existing framework automatically addresses quality anomalies both comprehensively and generically, across all aspects of data quality. To fill this gap, we propose a framework that automatically corrects big data quality anomalies using an intelligent predictive model. The framework addresses the main aspects of data quality by considering six key quality dimensions: Accuracy, Completeness, Conformity, Uniqueness, Consistency, and Readability. It is not tied to a specific field and is designed to apply across various areas, offering a generic approach to data quality anomalies. The framework was implemented on two datasets, where the predictive model achieved an accuracy of 98.22%; the results further show that the framework raised the data quality score to 99%, an improvement of up to 14.76%.
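For illustration only (the abstract gives no formulas or code), the sketch below shows one plausible way to score a table on the six quality dimensions the framework considers and to aggregate them into a single quality score. The column names (name, email, age, birth_year, signup_year), the per-dimension validation rules, and the equal-weight aggregation are all assumptions made for this example, not details taken from the paper.

```python
# Hypothetical sketch: per-dimension quality scores (0-100) for a toy table,
# aggregated into one overall score. All rules here are illustrative assumptions.
import pandas as pd

DIMENSIONS = ["Accuracy", "Completeness", "Conformity",
              "Uniqueness", "Consistency", "Readability"]

def quality_scores(df: pd.DataFrame) -> dict:
    """Score each of the six dimensions on a 0-100 scale."""
    scores = {}
    # Completeness: share of non-null cells across the whole table.
    scores["Completeness"] = 100 * df.notna().to_numpy().mean()
    # Uniqueness: share of rows that are not exact duplicates.
    scores["Uniqueness"] = 100 * (1 - df.duplicated().mean())
    # Conformity: share of 'email' values matching a simple address pattern.
    scores["Conformity"] = 100 * df["email"].fillna("").str.match(
        r"^[^@\s]+@[^@\s]+\.[^@\s]+$").mean()
    # Accuracy: share of 'age' values inside a plausible range.
    scores["Accuracy"] = 100 * df["age"].between(0, 120).mean()
    # Consistency: 'signup_year' must not precede 'birth_year'.
    scores["Consistency"] = 100 * (df["signup_year"] >= df["birth_year"]).mean()
    # Readability: share of 'name' values that are plain printable text.
    scores["Readability"] = 100 * (df["name"].fillna("").astype(str)
                                   .map(str.isprintable).mean())
    return scores

def overall_score(scores: dict) -> float:
    """Unweighted mean over the six dimensions (weighting is a design choice)."""
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

# Usage on a small example: row 2 violates Conformity, Accuracy, and Consistency.
df = pd.DataFrame({
    "name": ["Ada", "Bob"],
    "email": ["ada@example.com", "not-an-email"],
    "age": [36, 250],
    "birth_year": [1988, 1990],
    "signup_year": [2020, 1985],
})
scores = quality_scores(df)
print(scores, overall_score(scores))
```

The unweighted mean is a placeholder; the paper's actual aggregation of dimension scores into its reported quality score may differ.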
