Abstract

A wireless sensor network (WSN) is one of the most typical applications of the Internet of Things (IoT). Missing values are unavoidable in sensor data streams because of the way WSNs work and the environments in which they are deployed. In most cases, missing values are imputed before any further data processing. Imputation can be implemented in different ways; among them, exploiting the correlation information hidden in the sensor data has attracted many researchers and produced many results. Following the same direction, in this paper we propose VTN imputation, an online missing data imputation algorithm based on virtual temporal neighbors. First, the virtual temporal neighbor (VTN) in the sensor data stream is defined, and its calculation method is given. Next, the VTN imputation algorithm, which applies VTNs to estimate missing values by regression, is presented. Finally, we conduct experiments on three different real sensor datasets to evaluate the imputation accuracy and computation time of our algorithm. The experimental results show that the VTN imputation algorithm benefits from a fuller exploitation of the correlation in sensor data, achieving better imputation accuracy with acceptable processing time in real WSN applications.

Highlights

  • With the development of the Internet of Things (IoT) [1], more devices and sensors are now deployed in physical environments.

  • In this paper, addressing the above problems, we propose a new imputation algorithm, virtual temporal neighbor (VTN) imputation, that works on online sensor data streams in wireless sensor networks (WSNs). The main contributions of our work are as follows: VTN works on the temporal neighbors in the sensor data stream (SDS) and requires the measurement data from only one sensor on the node.

  • Proceeding in the second sense, in other words, extracting extra information from the measurement values, improves the imputation accuracy of the VTN algorithm. There are two crucial points in VTN. One is that the virtual temporal neighbor of a value is calculated from the two values before and after the time point of that value, which boosts the accuracy of the virtual temporal neighbor. The other is that only the virtual temporal neighbors that are closest, in value and change rate, to the temporal neighbor of the missing value are used in the regression to calculate the
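
The select-then-regress idea described above can be illustrated with a minimal sketch. It is not the paper's VTN algorithm: the excerpt does not specify how a virtual temporal neighbor is computed, so the sketch substitutes the preceding reading (with its change rate) as the neighbor of each past value. The function name `impute_next`, the parameter `k`, and the plain Euclidean distance over value and change rate are all assumptions made for illustration.

```python
from math import hypot


def impute_next(history, k=5):
    """Estimate a missing reading that follows `history` (one sensor, oldest first).

    Illustrative only: the preceding reading stands in for the paper's
    virtual temporal neighbor, whose exact calculation is not given here.
    """
    if len(history) < 4:
        raise ValueError("need at least four past readings")

    # Temporal neighbor of the missing value: last reading and its change rate.
    q_value = history[-1]
    q_rate = history[-1] - history[-2]

    # For each past reading history[i], describe its preceding neighbor by
    # value and change rate, and measure how close it is to the query.
    # (Assumption: value and rate are compared on the same scale.)
    candidates = []
    for i in range(2, len(history)):
        value = history[i - 1]
        rate = history[i - 1] - history[i - 2]
        dist = hypot(value - q_value, rate - q_rate)
        candidates.append((dist, value, history[i]))

    # Keep only the k neighbors closest in value and change rate.
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    xs = [c[1] for c in nearest]
    ys = [c[2] for c in nearest]

    # Ordinary least squares for y = slope * x + intercept on the kept pairs.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return mean_y  # all neighbors share one value; fall back to the mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y + slope * (q_value - mean_x)
```

Restricting the regression to the closest neighbors keeps the fit local to the current behaviour of the stream, which is the motivation the text gives for the selection step. Only readings from a single sensor are needed, matching the single-sensor setting described above.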


Summary

Introduction

With the development of the Internet of Things (IoT) [1], more devices and sensors are now deployed in physical environments. Most applications demand complete datasets, i.e., data obtained from the WSNs that contain no missing values, because missing values degrade the performance of the processing algorithms and can even make them inapplicable. For example, in an application that recognizes human activities from the measurement values of sensors such as accelerometers and gyroscopes, using a random forest classifier and a support vector machine (SVM) for classification, a 5% missing rate in the dataset decreases the performance of HASC recognition to 83% and 84%, respectively, and a 20% missing rate drops it to 45% and 46%, which is unacceptable for the application [6]. Factors including the signal strength fading and interferences from the environment bring about 9% to 17%
