Abstract

The Internet of Things (IoT) is one of the driving forces behind Industry 4.0 and has the potential to improve the entire value chain, especially in the context of industrial manufacturing. However, results derived from IoT data are only viable if a high level of data quality is maintained. Completeness is especially critical here, as incomplete data is one of the most common and costly data quality defects in the IoT context. Nevertheless, existing approaches for assessing the completeness of IoT data are limited in their applicability because they assume either that the number of real-world entities is known or that the real-world entities appear in regular patterns. Thus, they cannot handle the uncertainty regarding the number of real-world entities that is typical of the IoT context. Against this background, the paper proposes a novel, probability-based metric that addresses these issues and provides interpretable metric values representing the probability that an IoT database is complete. This probability is assessed by detecting outliers in the deviation between the estimated number of real-world entities and the number of digital entities. The evaluation with IoT data from a German car manufacturer demonstrates that the provided metric values are useful and informative and can discriminate well between complete and incomplete IoT data. The metric has the potential to reduce the cost, time, and effort associated with incomplete IoT data, providing tangible benefits in real-world applications.
