Batch Query Research Articles

In recent years there has been an upsurge of interest in spatial databases. A major issue is how to efficiently manipulate massive amounts of spatial data stored on disk in multidimensional spatial indexes (data structures). Construction of spatial indexes (bulk loading) has been studied intensively in the database community. The continuous arrival of massive amounts of new data makes it important to update existing indexes efficiently (bulk updating). In this paper we present a simple, yet efficient, technique for performing bulk update and query operations on multidimensional indexes. We present our technique in terms of the so-called R-tree and its variants, as they have emerged as practically efficient indexing methods for spatial data. Our method uses ideas from the buffer tree lazy buffering technique and fully utilizes the available internal memory and the page size of the operating system. We give a theoretical analysis of our technique, showing that it is efficient in terms of I/O communication, disk storage, and internal computation time. We also present the results of an extensive set of experiments showing that in practice our approach performs better than the previously best known bulk update methods with respect to update time, and that it produces a better quality index in terms of query performance. One important novel feature of our technique is that in most cases it allows us to perform a batch of updates and queries simultaneously. This ability is essential in environments where queries have to be answered even while the index is being updated and reorganized.
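The general idea of lazy buffering mentioned in the abstract above can be illustrated with a toy sketch. This is not the paper's R-tree method or the buffer tree itself: it assumes a one-dimensional sorted list as the "index", and the class name `BufferedIndex`, the `buffer_capacity` parameter, and the flush policy are all hypothetical choices made for illustration. It only shows how pending updates can be applied in bulk while queries are still answered correctly.

```python
import bisect


class BufferedIndex:
    """Toy 1-D index that buffers inserts and applies them in batches.

    Hypothetical illustration only; a real spatial index would be an
    R-tree with buffers attached to its nodes.
    """

    def __init__(self, buffer_capacity=4):
        self.keys = []                  # sorted contents of the "index"
        self.buffer = []                # pending inserts, not yet merged
        self.buffer_capacity = buffer_capacity
        self.flushes = 0                # number of bulk merges performed

    def insert(self, key):
        # Defer the index update; only merge once the buffer is full.
        self.buffer.append(key)
        if len(self.buffer) >= self.buffer_capacity:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # One bulk merge instead of one index update per insert.
        self.keys = sorted(self.keys + self.buffer)
        self.buffer.clear()
        self.flushes += 1

    def range_query(self, lo, hi):
        # Search the merged index, then add matching pending inserts,
        # so queries see updates that have not been flushed yet.
        start = bisect.bisect_left(self.keys, lo)
        stop = bisect.bisect_right(self.keys, hi)
        pending = [k for k in self.buffer if lo <= k <= hi]
        return sorted(self.keys[start:stop] + pending)
```

In this sketch a query consults both the merged keys and the pending buffer, which mirrors, in a very simplified way, the abstract's point that queries can be answered while updates are still being applied.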

This paper describes approaches to improve the performance of one of the most common and increasingly important aspects of the Human Genome Project (HGP): large-volume, batch comparison of DNA sequence data. This basic comparison operation, usually carried out by the well-known BLAST program on one subject sequence against the internationally available databases of nearly five million target sequences, is already used hundreds of thousands of times each day by researchers around the world. At present, it is still used primarily in single-query or small-batch mode. As the entire sequence of the human genome nears completion, the area of functional genomics, and the use of micro-arrays of sets of genes, is coming to the fore. These developments will demand ever more efficient means of BLASTing sets of data that will make single-processor implementation on powerful workstations infeasible. We describe the three primary parallel components of BLAST. The first is at the sequence-to-sequence comparison level. The second parallelizes a single query across a partitioned and distributed database. The third partitions the set of queries itself across a set of servers with replicated or partitioned databases. The three methods may be employed alone or in concert. We describe our current implementation, which parallelizes batch requests, and our plans for implementing the other levels. The results will ultimately be applied to hardware assistance for this soon-to-be primitive computer operation.
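The third level described above, partitioning a batch of queries across workers that each hold a replica of the database, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation and not BLAST: `toy_score` is a hypothetical stand-in that counts shared k-mers, and `partition`, `best_hit`, and `run_batch` are names invented for this example. Threads are used only to keep the sketch self-contained; a real deployment would more likely distribute chunks across processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_score(query, target, k=3):
    """Hypothetical stand-in for sequence comparison: shared k-mer count."""
    q_kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    t_kmers = {target[i:i + k] for i in range(len(target) - k + 1)}
    return len(q_kmers & t_kmers)


def best_hit(query, database):
    # Compare one query against every target and keep the top scorer.
    return max(database, key=lambda target: toy_score(query, target))


def partition(items, n_parts):
    # Round-robin split so each worker gets a similar share of the batch.
    return [items[i::n_parts] for i in range(n_parts)]


def run_batch(queries, database, n_workers=2):
    """Split the query batch across workers that share a replicated database."""
    chunks = partition(queries, n_workers)

    def work(chunk):
        return {query: best_hit(query, database) for query in chunk}

    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(work, chunks):
            results.update(partial)
    return results
```

The second level in the abstract (one query against a partitioned database) would instead apply `partition` to `database` and combine the per-partition best hits for each query.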

Articles published on Batch Query

Discrete profile comparison using information bottleneck

Online Aggregation on Data Cubes Without Auxiliary Information

Pipelining in multi-query optimization

Efficient Bulk Operations on Dynamic R-Trees

Parallelization of local BLAST service on workstation clusters

Online dynamic reordering

Multiple Query Optimization with Depth-First Branch-and-Bound and Dynamic Query Ordering

Optimal sample cost residues for differential database batch query problems

Analysis of recursive batched interpolation search

Estimating disk head movement in batched searching
