Abstract

IoT sensors are becoming an increasingly important supplement to traditional monitoring systems, particularly for in-situ monitoring. Data collected using IoT sensors are often plagued by missing values that result from sensor faults, network failures, drifts and other operational issues. Missing data can have a substantial impact on in-field sensor calibration methods. The goal of this research is to achieve effective calibration of sensors in the presence of such missing data. To this end, this paper pursues two objectives: 1) identify and examine an effective imputation strategy for missing data in IoT sensors; 2) determine sensor calibration performance when calibration techniques are applied to datasets with imputed values. Specifically, this paper examines the performance of Variational Autoencoder (VAE), Neural Network with Random Weights (NNRW), Multiple Imputation by Chained Equations (MICE), Random Forest-based Imputation (missForest) and K-Nearest Neighbour (KNN) for imputation of missing values in IoT sensor data. Furthermore, the performance of sensor calibration via different supervised algorithms trained on the imputed dataset was evaluated. The analysis showed that the VAE technique outperformed the other methods in imputing missing values at different proportions of missingness on two real-world datasets. Experimental results also showed improved calibration performance with the imputed dataset.

Highlights

  • Expanding the measurement networks for greenhouse gases (GHG) is vital for understanding global GHG emission trends and the effectiveness of emission mitigation policies, strategies and initiatives, making it possible to ascertain how far emission reduction targets are being met at the local, regional and global scales [1]. Low-cost sensors (LCS) have the potential to enhance the spatio-temporal resolution of data acquisition for key GHG variables

  • The analysis shows that at any measurement point, the values of auxiliary variables such as temperature (T), relative humidity (RH), and other sensor variables with non-missing values exhibit important correlations that the imputation methods can exploit to predict missing values of a target variable

  • As it would be impossible to assess the performance of imputation strategies when the real values are unknown, we introduced missing values into the datasets following two distinct patterns so that imputed values could be compared against the known originals
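
The evaluation idea in the last highlight can be sketched in a few lines. The following is a minimal illustration, not the authors' pipeline: it uses synthetic data (a hypothetical target that depends linearly on T and RH), masks values completely at random as one possible missingness pattern (the two patterns used in the paper are not specified here), and fills the gaps with a plain KNN average over the auxiliary variables. All function names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def mask_at_random(values, fraction, rng):
    """Copy `values`, set a random `fraction` of entries to None, return (copy, masked indices)."""
    n_missing = int(round(fraction * len(values)))
    idx = sorted(rng.sample(range(len(values)), n_missing))
    masked = list(values)
    for i in idx:
        masked[i] = None
    return masked, idx


def knn_impute(target, aux, k=3):
    """Fill None entries with the mean target of the k nearest observed rows in auxiliary space.

    Simplification: auxiliary features are used unscaled.
    """
    observed = [i for i, v in enumerate(target) if v is not None]
    filled = list(target)
    for i, v in enumerate(target):
        if v is None:
            nearest = sorted(observed, key=lambda j: math.dist(aux[i], aux[j]))[:k]
            filled[i] = sum(target[j] for j in nearest) / len(nearest)
    return filled


def rmse(truth, estimate, idx):
    """Root-mean-square error restricted to the artificially masked positions."""
    return math.sqrt(sum((truth[i] - estimate[i]) ** 2 for i in idx) / len(idx))


def run_demo(seed=0, n=200, fraction=0.2):
    """Return (KNN RMSE, mean-imputation RMSE) on synthetic data."""
    rng = random.Random(seed)
    # Hypothetical auxiliary variables: temperature and relative humidity.
    aux = [(rng.uniform(10, 30), rng.uniform(30, 90)) for _ in range(n)]
    # Hypothetical target correlated with the auxiliaries, plus noise.
    truth = [400 + 2.0 * t + 0.5 * rh + rng.gauss(0, 1) for t, rh in aux]
    masked, idx = mask_at_random(truth, fraction, rng)

    knn_filled = knn_impute(masked, aux, k=3)

    # Baseline: replace every gap with the mean of the observed values.
    observed = [v for v in masked if v is not None]
    mean_value = sum(observed) / len(observed)
    mean_filled = [mean_value if v is None else v for v in masked]

    return rmse(truth, knn_filled, idx), rmse(truth, mean_filled, idx)


if __name__ == "__main__":
    knn_error, mean_error = run_demo()
    print(f"KNN RMSE: {knn_error:.2f}, mean-imputation RMSE: {mean_error:.2f}")
```

Because the masked positions are known, the error is computed only where values were removed; the same harness could wrap any other imputation method by swapping out `knn_impute`.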

Summary

Introduction

Expanding the measurement networks for greenhouse gases (GHG) is vital for understanding global GHG emission trends and the effectiveness of emission mitigation policies, strategies and initiatives, making it possible to ascertain how far emission reduction targets are being met at the local, regional and global scales [1]. Low-cost sensors (LCS) have the potential to enhance the spatio-temporal resolution of data acquisition for key GHG variables. However, LCS are prone to diverse issues, including bias, drift, precision degradation, and the loss of considerable amounts of data due to operational issues [2]. The European Union Data Quality Directive (EU-DQD) defined the data quality objective (DQO) that a monitoring method needs to comply with to be used as an indicative measurement for regulatory purposes [9]. The directive also defined the degree of data completeness required of such a monitoring method. To meet these requirements, and for LCS to be considered suitable for this purpose, data completeness is essential for the sensors.
