Abstract
The growing number of connected Internet of Things (IoT) devices has increased the need to process IoT data from multiple heterogeneous data stores. IoT data integration is challenging because data stores differ in their query languages, data models, and schemas. In this paper, we propose a multi-store query system for IoT data called MusQ, in which users can formulate join queries over heterogeneous data sources. To reconcile the heterogeneity between the source schemas of IoT data stores, we extract a global schema from the local source schemas semi-automatically by applying schema-matching and schema-mapping steps. To spare users from having to learn the finer details of various query languages, we define a unified query language called the multi-store query language (MQL), which follows a subset of the Datalog grammar. Thus, users can easily retrieve IoT data from multiple heterogeneous sources with MQL queries. Because the three MQL query-processing join algorithms are based on a mediator–wrapper approach, MusQ performs efficient data integration over significant volumes of IoT data from multiple stores. We conduct extensive experiments to evaluate the performance of the MusQ system using synthetic and large real IoT data sets for three different types of data stores (RDBMS, NoSQL, and HDFS). The experimental results demonstrate that MusQ is a suitable, scalable, and efficient query-processing system for multiple heterogeneous IoT data stores. These advantages of MusQ are important in several areas that involve complex IoT systems, such as smart cities, healthcare, and energy management.
Highlights
The proliferation of Internet of Things (IoT) technology has led to a rapid deployment of a massive number of IoT devices [1]
In this paper, we have presented the design of MusQ for building a multi-store query processing system
MusQ semi-automatically constructs the global schema from local source schemas using schema-matching and schema-mapping steps to provide integrated access to multiple data sources
Summary
The proliferation of Internet of Things (IoT) technology has led to a rapid deployment of a massive number of IoT devices [1]. To build a useful multi-store system for IoT data, we investigate three key features by addressing three critical challenges: (1) constructing a global schema from local source schemas by exploiting relationships, as in the federated approach; (2) performing complex queries on multiple data stores without knowing all the different query languages; and (3) efficiently executing user queries, especially join operations, to retrieve relevant data from local sources and merge it into a final result. To address these challenges, we propose a multi-store query system called MusQ, which uses the federated approach.
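To make the mediator–wrapper idea behind challenges (1) and (3) concrete, the following is a minimal sketch and not MusQ's actual implementation. It assumes each local store is hidden behind a wrapper that renames local attributes to global-schema attributes, and that the mediator joins the wrapped rows with a plain in-memory hash join. The names `ListWrapper`, `mediator_hash_join`, and all attribute names are hypothetical and chosen only for illustration; the source does not specify which three join algorithms MusQ uses.

```python
from collections import defaultdict
from typing import Dict, Iterator, List

Row = Dict[str, object]


class ListWrapper:
    """Hypothetical wrapper: exposes a local store as rows in the global schema."""

    def __init__(self, rows: List[Row], mapping: Dict[str, str]):
        self.rows = rows
        # Schema mapping (assumed form): local attribute name -> global attribute name.
        self.mapping = mapping

    def scan(self) -> Iterator[Row]:
        # Rename each local attribute to its global-schema name.
        for row in self.rows:
            yield {self.mapping.get(k, k): v for k, v in row.items()}


def mediator_hash_join(left: ListWrapper, right: ListWrapper, key: str) -> List[Row]:
    """Join rows from two wrappers on a shared global attribute (simple hash join)."""
    # Build phase: index the left source by the join key.
    index = defaultdict(list)
    for row in left.scan():
        index[row[key]].append(row)
    # Probe phase: look up each right row and merge matching pairs.
    result = []
    for row in right.scan():
        for match in index.get(row[key], []):
            result.append({**match, **row})
    return result


# Two stand-in "stores" whose local schemas name the sensor identifier differently.
sensors = ListWrapper(
    [{"sid": 1, "loc": "lab"}, {"sid": 2, "loc": "roof"}],
    {"sid": "sensor_id", "loc": "location"},
)
readings = ListWrapper(
    [{"sensor": 1, "temp": 21.5}, {"sensor": 1, "temp": 22.0}, {"sensor": 3, "temp": 19.0}],
    {"sensor": "sensor_id", "temp": "temperature"},
)
joined = mediator_hash_join(sensors, readings, "sensor_id")
```

In this sketch, the user of the mediator only sees the global attribute `sensor_id`; the differing local names `sid` and `sensor` stay inside the wrappers. A real system would push selections down to each store and stream results rather than hold them in memory.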