Abstract

Big data, characterized by huge volumes of continuously varying information streams, presents formidable challenges in acquisition, processing, and transmission, especially on novel technology platforms such as the Internet-of-Things (IoT) and Wireless Sensor Networks (WSNs). Either by design or due to physical limitations, a large number of measurements never reach the central processing stations, making data analytics even more problematic. In this work, we propose Singular Spectrum Matrix Completion (SS-MC), a novel approach for the simultaneous recovery of missing data and the prediction of future behavior in the absence of complete measurement sets. The goal is achieved by solving an efficient minimization problem that exploits the low-rank representation of the associated trajectory matrices when expressed in terms of appropriately designed dictionaries obtained by leveraging the theory of Singular Spectrum Analysis (SSA). Experimental results on real datasets demonstrate that the proposed scheme is well suited for the recovery and prediction of multiple time series, achieving lower estimation error than state-of-the-art schemes.
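
For intuition, the sketch below shows (in Python) the two core ingredients the abstract refers to: embedding a time series into its SSA trajectory (Hankel) matrix, and recovering the missing entries of that matrix by iterative singular value thresholding. This is a minimal illustration under assumed parameter values (window length `L`, threshold `tau`, iteration count), not the paper's SS-MC algorithm, which additionally employs SSA-derived dictionaries.

```python
# Minimal sketch: SSA trajectory matrix + low-rank completion by
# iterative singular value thresholding. All parameter values are
# illustrative assumptions, not those of the SS-MC paper.
import numpy as np

def trajectory_matrix(x, L):
    """SSA trajectory (Hankel) matrix of series x with window length L."""
    K = len(x) - L + 1
    return np.array([x[i:i + L] for i in range(K)]).T  # shape (L, K)

def svt_complete(X, mask, tau, n_iter=200):
    """Fill the unobserved entries of X (mask == False) by repeatedly
    soft-thresholding the singular values and re-imposing the data."""
    Y = np.where(mask, X, 0.0)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        s = np.maximum(s - tau, 0.0)   # soft-threshold singular values
        Y = (U * s) @ Vt               # current low-rank estimate
        Y[mask] = X[mask]              # keep observed entries fixed
    return Y

# Toy usage: a sinusoid with roughly 40% of its samples missing.
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * np.arange(200) / 25)
obs = rng.random(200) > 0.4
H = trajectory_matrix(np.where(obs, x, 0.0), L=50)
M = trajectory_matrix(obs.astype(float), L=50) > 0.5
H_hat = svt_complete(H, M, tau=1.0)
# Anti-diagonal averaging maps the completed Hankel matrix back to a series.
Lw, K = H_hat.shape
x_hat = np.array([H_hat[::-1].diagonal(k).mean() for k in range(-Lw + 1, K)])
```

The Hankel structure is what lets a short, predictable time series become a low-rank matrix in the first place; completion in the trajectory domain therefore recovers gaps and, by extending the matrix, can extrapolate future samples.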

Highlights

  • The dynamic nature of Big Data, a feature termed velocity, is a critical aspect of massive data streams from a signal processing viewpoint [1].

  • To evaluate the performance of the proposed low-rank reconstruction and prediction scheme, we consider real data from the Intel Berkeley Research Lab dataset [56] and the SensorScope Grand St-Bernard dataset [57]. The former dataset contains the recordings of 54 multimodal sensors located in an indoor environment over a one-month period, while the latter contains multimodal measurements from 23 stations deployed at the Grand St-Bernard pass between Switzerland and Italy.

  • The threshold τ for the singular value thresholding operator is set to preserve 90% of the signals' energy, while the parameter μ was set to 0.01 through a validation process; the specific value had minimal impact on performance (see the sketch after this list).
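
As a hedged illustration of the energy criterion in the last highlight, the snippet below picks a singular value cutoff that preserves a given fraction of a matrix's energy (the sum of squared singular values). Only the 90% figure and μ = 0.01 come from the paper; the helper `energy_threshold` and the toy matrix are assumptions.

```python
# Hypothetical helper: choose the SVT cutoff tau so that the retained
# singular values preserve a given fraction of the matrix energy.
import numpy as np

def energy_threshold(X, energy=0.90):
    """Cutoff below the smallest singular value kept when preserving
    the requested fraction of sum(s**2)."""
    s = np.linalg.svd(X, compute_uv=False)     # sorted in descending order
    cum = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cum, energy)) + 1  # number of values to keep
    return s[r] if r < len(s) else 0.0         # largest discarded value

rng = np.random.default_rng(1)
A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 30))  # low-rank toy
tau = energy_threshold(A, energy=0.90)  # preserves 90% of the energy
mu = 0.01                               # regularization weight from the paper
```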

Summary

Introduction

The dynamic nature of Big Data, a feature termed velocity, is a critical aspect of massive data streams from a signal processing viewpoint [1]. Due to the high velocity of the input streams, measurements may be missing with high probability. This phenomenon can be attributed to three factors, namely: (a) intentionally collecting a subset of the measurements for efficiency purposes; (b) unintentional subsampling due to desynchronization; and (c) missing measurements due to communication errors, including packet drops, outages, and congestion. To elaborate on these factors, we consider data streams associated with the Internet-of-Things (IoT) paradigm and we focus on Wireless Sensor Networks (WSNs), since WSNs can serve as an enabling platform for IoT applications [2, 3]. In addition to energy consumption and desynchronization, missing measurements can be attributed to network outages and packet losses, which are frequent in WSNs deployed in harsh and cluttered environments, causing a large number of packets to fail to reach their destination.
