Abstract

Query processing is an essential factor in the performance of a distributed database. The communication cost and processing cost of query processing should be kept to a minimum. Because relations are partitioned horizontally or vertically, a query is divided into sub-queries on the partitions, and these require operations at geographically separated databases. Query optimization is a difficult task in a distributed database environment because data location becomes a foremost consideration. The query optimizer should select an efficient execution plan for the given query; if a poor plan is selected, the database system will perform poorly. The cost of executing a query is a function of the system resources needed to execute it, such as CPU time and the number of read and write operations on a relation. To produce realistic cost estimates, the optimizer needs to estimate the size of sub-query results, which plays a vital role in selecting the join order of the relations. To estimate these sizes, the optimizer needs to know the selectivity of the query. This paper briefly describes the performance of join and semi-join operations in distributed databases and analyzes it with a practical application.

Highlights

  • The performance of database management systems, and of applications that involve large amounts of data, depends on distributed and parallel processing

  • The local processing phase consists of selections and projections on the relations; the reduction phase applies a chain of reducers, such as semi-joins and joins, to reduce the size of the relations; and the final processing phase sends all resulting relations to the assembly site, where the final result of the query is constructed

  • Distributed query optimization involves several main layers


Summary

INTRODUCTION

The performance of database management systems, and of applications that involve large amounts of data, depends on distributed and parallel processing. Distributed query processing proceeds in three phases: the local processing phase consists of selections and projections on the relations; the reduction phase applies a chain of reducers, such as semi-joins and joins, to reduce the size of the relations; and the final processing phase sends all resulting relations to the assembly site, where the final result of the query is constructed. A naive method, such as sending all relations directly to the third phase and joining them there, is unattractive because of its huge transmission overhead and because it benefits little from parallelism. In distributed query processing, partitioning a relation into a number of partitions, taking the union of the partitions to form the entire relation, and transferring a relation or partition from one database to another are frequent operations.
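The effect of the reduction phase can be illustrated with a minimal Python sketch. It is not the paper's implementation: the relations `employees` and `departments`, the two-site layout, and the cost unit (one unit per attribute value sent over the network) are all assumptions made for illustration. The sketch compares shipping a whole relation to the assembly site with first reducing it by a semi-join.

```python
# Hypothetical setup: `employees` is stored at site 1, `departments` at
# site 2, and site 2 is the assembly site where the join result is built.

def project_keys(relation, key):
    """Project a relation onto its join attribute, removing duplicates."""
    return [{key: v} for v in sorted({row[key] for row in relation})]

def semi_join(r, s_keys, key):
    """Keep only the tuples of r whose join value appears in s_keys."""
    wanted = {row[key] for row in s_keys}
    return [row for row in r if row[key] in wanted]

def join(r, s, key):
    """Equi-join of r and s on `key`, using a hash index built on s."""
    index = {}
    for row in s:
        index.setdefault(row[key], []).append(row)
    return [{**left, **right} for left in r for right in index.get(left[key], [])]

def transfer_cost(relation):
    """Assumed cost model: one unit per attribute value transmitted."""
    return sum(len(row) for row in relation)

employees = [
    {"emp_id": 1, "name": "Ada", "dept_id": 10},
    {"emp_id": 2, "name": "Bo", "dept_id": 20},
    {"emp_id": 3, "name": "Cy", "dept_id": 30},
    {"emp_id": 4, "name": "Di", "dept_id": 40},
    {"emp_id": 5, "name": "Ed", "dept_id": 10},
    {"emp_id": 6, "name": "Flo", "dept_id": 50},
]
departments = [
    {"dept_id": 10, "dept_name": "Sales"},
    {"dept_id": 30, "dept_name": "Research"},
]

# Naive plan: ship all of `employees` to site 2 and join there.
naive_cost = transfer_cost(employees)

# Semi-join plan: ship the join keys of `departments` to site 1, reduce
# `employees` there, then ship only the reduced relation back to site 2.
keys = project_keys(departments, "dept_id")
reduced = semi_join(employees, keys, "dept_id")
semi_cost = transfer_cost(keys) + transfer_cost(reduced)

# Both plans must produce the same join result at the assembly site.
assert join(reduced, departments, "dept_id") == join(employees, departments, "dept_id")

print(naive_cost, semi_cost)  # prints: 18 11
```

In this toy data only half of the `employees` tuples have a matching department, so the reduction pays for the extra key transfer. When nearly every tuple matches, the semi-join removes little and the key shipment is pure overhead, which is why the choice between join and semi-join depends on selectivity.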

METHODOLOGY OF DISTRIBUTED DATABASES
Communication cost in query processing
Performance cost in query processing
COMPARISONS OF JOIN AND SEMI JOIN
DEDUCTION