Tuple Structures Research Articles

Keyword search in relational databases has been extensively studied. Given a relational database, a keyword query finds a set of tuple structures interconnected by foreign key references. On an RDBMS, a keyword query is processed in two steps, namely candidate network (CN) generation and CN evaluation, where each CN is an SQL statement. Commonly, processing a single keyword query requires more than 10,000 SQL statements. There are several approaches to processing a keyword query on an RDBMS, but the performance achievable on a uniprocessor architecture is limited. In this paper, we study the parallel computation of keyword queries on a multicore architecture. We make three observations about keyword query computation: a large number of SQL statements must be processed, the statements have a high potential for sharing, and the intermediate results are large while the final results are few. All three make parallel keyword query computation challenging. We investigate three approaches. We first study query-level parallelism, where each SQL statement is processed by one core. We distribute the SQL statements across cores according to three objectives: minimizing workload skew, minimizing inter-core sharing, and maximizing intra-core sharing. Such an approach risks load imbalance because errors in cost estimation accumulate. We then study operation-level parallelism, where each operation of an SQL statement is processed by one core. All operations are processed in stages, and in each stage the costs of the operations are re-estimated to reduce the accumulated error. Operation-level parallelism still suffers from workload skew when large operations are involved and a large number of cores are used. Finally, we propose a new algorithm that partitions relations adaptively in order to minimize the extra cost of partitioning while reducing workload skew.
We conducted extensive performance studies using two large real datasets, DBLP and IMDB, and we report the efficiency of our approaches in this paper.
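
As a rough illustration of the first objective of query-level parallelism (minimizing workload skew), the sketch below assigns queries to cores with a greedy longest-processing-time heuristic. It is not the algorithm from the paper: it ignores the two sharing objectives, and the function name `assign_queries`, the query identifiers, and the cost values are all hypothetical.

```python
import heapq


def assign_queries(costs, num_cores):
    """Greedily assign queries to cores, most expensive query first.

    costs: dict mapping a (hypothetical) query id to its estimated cost.
    Returns (assignment, loads), both keyed by core id.
    """
    # Min-heap of (current load, core id): the least loaded core is on top.
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = {core: [] for core in range(num_cores)}

    # Placing expensive queries first tends to reduce the final skew.
    for qid, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        load, core = heapq.heappop(heap)
        assignment[core].append(qid)
        heapq.heappush(heap, (load + cost, core))

    loads = {core: sum(costs[q] for q in qs) for core, qs in assignment.items()}
    return assignment, loads


if __name__ == "__main__":
    # Assumed cost estimates for five candidate networks.
    estimated = {"cn1": 8, "cn2": 7, "cn3": 6, "cn4": 5, "cn5": 4}
    assignment, loads = assign_queries(estimated, num_cores=2)
    print(assignment)
    print(loads)
```

Because the assignment depends entirely on the estimated costs, any estimation error carries straight into the per-core loads, which is the load imbalance risk the abstract attributes to this approach.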


It is widely recognized that the integration of information retrieval (IR) and database (DB) techniques provides users with a broad range of high-quality services. Along this direction, IR-style m-keyword query processing over a relational database in an RDBMS framework has been well studied. Such a query finds all hidden interconnected tuple structures, for example connected trees that contain the keywords and are interconnected by sequences of primary/foreign key relationships among tuples. A new and challenging issue is how to monitor events that are implicitly interrelated over an open-ended relational data stream for a user-given m-keyword query. Such a relational data stream is a sequence of tuple insertion and deletion operations. The difficulty of the problem lies in the number of costly joins that must be processed over time as tuples are inserted and deleted. This cost is mainly affected by three parameters: the number of keywords, the maximum size of the interconnected tuple structures, and the complexity of the database schema when it is viewed as a schema graph. In this paper, we propose two new techniques. First, we propose a novel algorithm to efficiently determine all the joins that need to be processed to answer an m-keyword query. Second, we propose a new demand-driven approach to processing such a query over a high-speed relational data stream. We show that high efficiency can be achieved by significantly reducing the number of intermediate results when processing joins over a relational data stream. The proposed techniques achieve high scalability in both query plan generation and query plan execution. We conducted extensive experimental studies using synthetic data and real data to simulate a relational data stream. Our approach significantly outperforms existing algorithms.
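
To make the notion of a relational data stream concrete, the following sketch maintains a single two-relation join under a sequence of tuple insertions and deletions, using one hash index per relation. It is only an illustration of why joins must be reprocessed as tuples arrive and leave; it is not the demand-driven approach proposed in the paper, and the class name `IncrementalJoin`, the relation labels `R` and `S`, and the tuple identifiers are assumptions.

```python
from collections import defaultdict


class IncrementalJoin:
    """Maintains R JOIN S on a shared key under inserts and deletes."""

    def __init__(self):
        self.left = defaultdict(set)    # join key -> tuple ids in R
        self.right = defaultdict(set)   # join key -> tuple ids in S
        self.results = set()            # current (R tuple id, S tuple id) pairs

    def apply(self, op, side, key, tuple_id):
        """Apply one stream operation and return the changed result pairs."""
        if side == "R":
            own, other = self.left, self.right
        else:
            own, other = self.right, self.left

        # Join partners currently stored on the opposite side for this key.
        partners = other.get(key, set())
        if side == "R":
            delta = {(tuple_id, p) for p in partners}
        else:
            delta = {(p, tuple_id) for p in partners}

        if op == "insert":
            own[key].add(tuple_id)
            self.results |= delta
        elif op == "delete":
            own[key].discard(tuple_id)
            self.results -= delta
        else:
            raise ValueError(f"unknown operation: {op}")
        return delta


if __name__ == "__main__":
    join = IncrementalJoin()
    join.apply("insert", "R", "k1", "paper1")
    print(join.apply("insert", "S", "k1", "author1"))
    join.apply("delete", "R", "k1", "paper1")
    print(join.results)
```

Each operation touches only the tuples that share its join key, but a keyword query over a realistic schema involves many such joins at once, which is where the cost described in the abstract comes from.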



Articles published on Tuple Structures

A New Unit Selection Optimisation Algorithm for Corpus-Based TTS Systems Using the RBF-Based Data Compression Technique

Top-k keyword search with recursive semantics in relational databases

Novel LVCSR Decoder Based on Perfect Hash Automata and Tuple Structures – SPREAD –

Ten thousand SQLs

FORMALIZATION OF TEXTUAL USE CASES BASED ON PETRI NETS

Scalable keyword search on large data streams

Modeling software testing costs and risks using fuzzy logic paradigm

