Abstract

The Big Data wave has led to a rapid increase in the amount of data being collected by organizations. While the accuracy and reliability of prediction models are often prioritized, the quality of the collected data is frequently overlooked. Poor data quality can result in the common problem of ‘garbage in, garbage out’. Traditional measures of data quality, such as accuracy, consistency, completeness, and timeliness, are no longer adequate in the era of Big Data. This paper therefore proposes a taxonomy of data quality dimensions specifically for Big Data, addressing emerging challenges by formulating 20 dimensions and grouping them into four distinct categories.
