Abstract

With the rapid growth in the use of various smart digital sensors, the Internet of Things (IoT) is a swiftly growing technology, which has contributed significantly to Industry 4.0 and the promotion of IoT-based smart factories, which gives rise to the new challenges of big data analytics and the implementation of machine learning techniques. This article proposes a practical framework that combines IoT techniques, a data lake, data analysis, and cloud computing for manufacturing equipment health-state monitoring and diagnostics in smart manufacturing. It addresses all the required aspects in the realization of such a system and allows the seamless interchange of data and functionality. Due to the specific characteristics of IoT sensor data (low quality, redundant multisources, partial labeling), we not only provide a promising framework but also give detailed insights and pay considerable attention to data quality issues. In the proposed framework, an ingestion procedure is designed to manage data collection, data security, data transformation and data storage issues. To improve the quality of IoT big data, a high-noise feature filter is proposed for automated preliminary sensor selection to suppress noisy features, followed by a noisy data cleaning module to provide good quality data for unbiased diagnosis modeling. The proposed framework can achieve seamless integration between IoT big data ingestion from the physical factory and machine learning-based data analytics in the virtual systems. It is built on top of the Apache Spark processing engine, being capable of working in both big data and real-time environments. One case study has been conducted based on a four-stage syngas compressor from real industries, which won the Best Industry Application of IoT at the BigInsights Data & AI Innovation Awards. The experimental results demonstrate the effectiveness of both the proposed IoT-architecture and techniques to address the data quality issues.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call