Abstract

Since big data was elevated to a national strategy, its application across industries has grown steadily. Data quality directly affects the value of data and influences managers' analysis and decisions. Addressing the characteristics of big data, such as volume, velocity, variety and value, we propose a big data quality management framework based on application scenarios, which covers data quality assessment and quality management for structured data, unstructured data and the data integration stage. Because structured data currently drives the core business of the enterprise, our approach starts from the main data and extends outward to the peripheral data layer by layer. Big data processing technologies, such as Hadoop and Storm, are used to construct a semantics-based big data cleaning system. Building on the JStorm platform, we present a real-time control system for big data quality. Finally, a big data quality evaluation system is built to assess the effect of data integration. Building on traditional data quality systems, the framework can ensure the output of high-quality big data. It helps enterprises understand the patterns in their data and increase the value of their core data, and therefore has practical application value.

