Abstract

IoT systems are widely used in manufacturing, and the volume of sensor data they generate is significant. In real-life scenarios, missing sensor data can cause problems, especially for data-driven machine learning (ML) models, so the resulting gaps should be handled before such models are employed. Common practice is either to remove the missing data completely or to apply simple arithmetic operations. However, the literature offers more sophisticated approaches that can be applied to these real-time IoT systems while taking the native data characteristics into account. This study compares the performance of regression-based ML algorithms used as missing data imputation methods: Support Vector Regression (SVR), Decision Tree Regression (DTR), Ridge Regression, K-Nearest Neighbors Regression (KNN), MissForest (MF), and XGBoost Regression (XGB). Datasets with missing data in different positions and proportions are created from experimentally collected time-series sensor data from a newly developed IoT system platform. Initial results from the ML models on these datasets are presented together with an overview of the IoT system architecture. The average RMSE and R² values of the six ML models show that Ridge Regression outperforms the other models for missing data imputation.
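The evaluation procedure summarized above (create a gap at a chosen position and proportion, impute it with a regression model, score the result with RMSE and R²) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the study's actual pipeline: the two synthetic sensor channels, the single contiguous gap, the helper names `make_gap`, `fit_ridge_1d`, and `evaluate`, and the penalty `alpha=1.0`. It uses a closed-form, single-predictor ridge fit so that it runs with the Python standard library only.

```python
import math
import random


def make_gap(n, proportion, position):
    """Return the indices of one contiguous gap at the start, middle, or end."""
    gap = max(1, int(n * proportion))
    start = {"start": 0, "middle": (n - gap) // 2, "end": n - gap}[position]
    return list(range(start, start + gap))


def fit_ridge_1d(xs, ys, alpha=1.0):
    """Closed-form ridge fit for one predictor with an unpenalized intercept."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    return slope, y_mean - slope * x_mean


def rmse(actual, predicted):
    """Root mean squared error between true and imputed values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r2(actual, predicted):
    """Coefficient of determination of the imputed values."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def evaluate(proportion, position, n=200, seed=0):
    """Mask a gap in the target channel, impute it, and return (RMSE, R2)."""
    rng = random.Random(seed)
    # Synthetic stand-in for two correlated sensor channels (not the study's data).
    reference = [20.0 + 5.0 * math.sin(i / 10.0) for i in range(n)]
    target = [2.0 * r + 1.0 + rng.gauss(0.0, 0.2) for r in reference]

    missing = set(make_gap(n, proportion, position))
    observed = [i for i in range(n) if i not in missing]

    # Fit only on rows where the target is observed.
    slope, intercept = fit_ridge_1d(
        [reference[i] for i in observed], [target[i] for i in observed]
    )
    held_out = sorted(missing)
    imputed = [slope * reference[i] + intercept for i in held_out]
    truth = [target[i] for i in held_out]
    return rmse(truth, imputed), r2(truth, imputed)


if __name__ == "__main__":
    for position in ("start", "middle", "end"):
        for proportion in (0.1, 0.3):
            err, score = evaluate(proportion, position)
            print(f"{position:>6} {proportion:.0%}: RMSE={err:.3f} R2={score:.3f}")
```

Averaging the two scores over all position and proportion combinations gives one summary number per model, which is the kind of comparison the abstract reports. Other regressors could be compared by swapping out `fit_ridge_1d` in the same loop.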
