Abstract

The digitalization of our society creates a large number of data streams, such as stock tickers, tweets, and sensor data. Making use of these streams has tremendous value. In the Semantic Web context, live information is queried from streams in real time, knowledge is discovered by integrating streams with data from heterogeneous sources, and insights hidden in the streams are inferred and extracted through logical reasoning. Handling large and complex streams in real time challenges the capabilities of current systems. This thesis therefore studies how to improve the efficiency of processing and reasoning over semantic streams. It comprises three projects that address different research problems motivated by real-world use cases. We propose new methods for these problems and implement systems to test our hypotheses on real datasets.

The first project addresses the problem that sudden increases in the input stream rate overload the system, causing degraded or unacceptable performance. We propose an eviction technique that, when a spike in the input rate occurs, discards data from the system to bound response latency at the cost of lower recall. The novelty of our solution lies in a data-aware approach that carefully prioritizes the data and evicts the less important items, preserving a high result recall.

The second project studies complex queries that must integrate streams with remote, external background data (BGD). Accessing remote BGD is expensive in terms of both latency and financial cost. We propose several methods that minimize this cost by exploiting query and data patterns: our system retrieves only the data most critical to answering the query and avoids wasting resources on the remaining BGD.
Lastly, as noise is inevitable in real-world semantic streams, the third project investigates how to use logical reasoning to identify and exclude noise from high-volume streams. We adopt a distributed stream processing engine (DSPE) to achieve scalability. On top of the DSPE, we optimize the reasoning procedures by balancing the costs of computation and communication. Reasoning tasks are therefore compiled into efficient DSPE workflows that can be deployed across large-scale computing clusters.
