Abstract

Since the emergence of sensor data streams, ever larger volumes of observations have had to be transmitted, stored and retrieved. Performing these tasks at the granularity of single points wastes resources. We therefore propose a concept that partitions observations by spatial, temporal or other criteria (or a combination of them) into data segments. We exploit the resulting proximity (with respect to the partitioning dimension(s)) within each data segment for compression and efficient data retrieval. While the method in principle allows lossless compression, it can also be used for progressive transmission with increasing accuracy wherever incremental data transfer is reasonable. In a first feasibility study, we apply the proposed method to a dataset of ARGO drifting buoys covering large spatio-temporal regions of the world's oceans and compare the achieved compression ratio to other formats.

Highlights

  • Data compression is one key aspect of managing sensor data streams, since technological progress in transfer rate, processing power and memory size tends to be outpaced by the ever-growing amount of available observations

  • The principle behind our compression method is derived from the Binary Space Partitioning tree (BSP tree); see Samet (2006)

  • The methodology presented here is useful where massive sensor data need to be compressed in a way that allows progressive retrieval with increasing accuracy per step. It supports the data types most commonly found in sensor data, such as Float/Double, Integer, Boolean, and DateTime, each with its own compression method
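The idea of progressive retrieval with increasing accuracy per step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single floating-point value known to lie in an interval [lo, hi] and borrows only the halving principle of a BSP tree, emitting one bit per bisection. The function names `encode_bits` and `decode_bits` and the example interval are hypothetical.

```python
def encode_bits(value, lo, hi, n_bits):
    """Bisect [lo, hi] n_bits times; emit 1 when value lies in the upper half."""
    bits = []
    for _ in range(n_bits):
        mid = (lo + hi) / 2.0
        if value >= mid:
            bits.append(1)
            lo = mid
        else:
            bits.append(0)
            hi = mid
    return bits


def decode_bits(bits, lo, hi):
    """Replay the bisection and return the midpoint of the remaining interval."""
    for b in bits:
        mid = (lo + hi) / 2.0
        if b:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical example: a temperature known to lie between 10 and 14 degrees C.
LO, HI = 10.0, 14.0
value = 12.7
bits = encode_bits(value, LO, HI, 8)

# Decoding a longer prefix of the bit stream narrows the estimate step by step.
errors = [abs(decode_bits(bits[:k], LO, HI) - value) for k in range(len(bits) + 1)]
```

In this sketch, a receiver that has seen only the first k bits already holds an estimate whose error is at most (HI - LO) / 2^(k+1), so transmission can stop as soon as the accuracy is sufficient.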

Introduction

Data compression is one key aspect of managing sensor data streams, since technological progress in transfer rate, processing power and memory size tends to be outpaced by the ever-growing amount of available observations. Compression methods should take into account the specific structure of the data they are applied to. Sensor observations typically describe continuous or quantitative variables in multiple dimensions such as latitude and longitude, time, temperature, pressure, voltage, etc. Where these data tend to be stationary in space and time, at least locally, there is high potential for compression: the actual values within a period and/or spatial range usually cover only a small range compared to the domain represented by a standard data type such as a floating-point number. We propose to partition observations by spatial, temporal or other criteria (or a combination of them) into data segments. For data retrieval from spatial or spatio-temporal databases, the creation of such data segments is reasonable
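The compression potential of locally stationary data can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the method of the paper: observations are (timestamp, value) pairs, segments are fixed-width time windows, and each value is stored as an integer offset from its segment minimum at a chosen resolution. The helper names, the window length and the sample readings are all hypothetical.

```python
import math
from collections import defaultdict


def segment_by_time(observations, window):
    """Group (timestamp, value) pairs into fixed-width time windows."""
    segments = defaultdict(list)
    for t, v in observations:
        segments[int(t // window)].append(v)
    return dict(segments)


def bits_per_value(values, resolution):
    """Bits needed to address every resolution step between min and max."""
    steps = math.ceil((max(values) - min(values)) / resolution) + 1
    return max(1, math.ceil(math.log2(steps)))


# Hypothetical temperature readings: (seconds since start, degrees C).
observations = [
    (0, 12.0), (600, 12.25), (1200, 12.5),   # first hour
    (3600, 18.0), (4200, 18.25),             # second hour
]

segments = segment_by_time(observations, window=3600)
per_segment = {k: bits_per_value(v, resolution=0.25) for k, v in segments.items()}
whole_dataset = bits_per_value([v for _, v in observations], resolution=0.25)
```

With these sample readings, the values need 2 bits and 1 bit within the two hourly segments, against 5 bits when the whole dataset is treated as one range and 64 bits for a standard double. Each segment additionally has to store its minimum and resolution once as a header, which is an overhead this sketch does not count.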
