Abstract

The quality of fragment allocation is key to improving the performance of join queries in a distributed database. Current strategies rely on heuristic rules to allocate fragments to locations, such as picking the location that already holds the most required data or applying a greedy algorithm. Despite their benefits, these heuristics are scenario-specific: in a distributed environment with varied query plans, different data distributions, and expensive network transfers, they can easily produce low-quality allocation plans because they do not generalize. To overcome this limitation, we propose AlCo (Allocate fragments based on Cost), a general strategy for allocating fragments. AlCo evaluates multiple candidate allocation plans by cost, using a modified genetic algorithm adapted from PostgreSQL. Its fitness function (the cost model) jointly considers the factors that vary across scenarios, which is what lets it generalize. To reduce the risk introduced by the randomness of the genetic algorithm, AlCo uses the cost of the plan produced by current heuristic methods as an upper bound, which makes the search more robust. We implement AlCo in a distributed database system, and experiments on the TPC-H benchmark show that AlCo improves performance by up to 2x–4x over existing strategies and performs well in robustness and scalability.
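The idea of a cost-driven genetic search bounded by a heuristic plan can be illustrated with a minimal Python sketch. It is not AlCo's implementation: the fragment catalogue, the join tasks, the cost model (shipping cost plus the load of the busiest site), the constants, and all function names are hypothetical and chosen only for illustration. The heuristic plan is placed in the initial population and the best plans always survive, so the returned plan can never cost more than the heuristic one.

```python
import random

# Hypothetical toy catalogue: fragment -> (site storing it, size in MB).
FRAGMENTS = {
    "orders_1": (0, 40), "orders_2": (1, 60),
    "lineitem_1": (0, 90), "lineitem_2": (2, 80),
    "customer_1": (0, 30), "customer_2": (1, 20),
}
# Hypothetical join tasks: each reads two fragments and runs at exactly one site.
TASKS = [
    ("orders_1", "lineitem_1"),
    ("orders_2", "lineitem_2"),
    ("customer_1", "orders_1"),
    ("customer_2", "orders_2"),
    ("customer_1", "lineitem_1"),
]
NUM_SITES = 3
NET_COST = 1.0  # assumed cost per MB shipped between sites
CPU_COST = 2.0  # assumed cost per MB processed at a site


def cost(plan):
    """Assumed cost model: total shipping cost plus load of the busiest site."""
    shipped = 0.0
    load = [0.0] * NUM_SITES
    for task, site in zip(TASKS, plan):
        for frag in task:
            home, size = FRAGMENTS[frag]
            if home != site:
                shipped += size  # fragment must travel to the execution site
            load[site] += size
    return NET_COST * shipped + CPU_COST * max(load)


def heuristic_plan():
    """Baseline heuristic: run each task at the site holding most of its data."""
    plan = []
    for task in TASKS:
        local = [0] * NUM_SITES
        for frag in task:
            home, size = FRAGMENTS[frag]
            local[home] += size
        plan.append(local.index(max(local)))
    return plan


def genetic_search(generations=200, pop_size=30, seed=0):
    """Elitist genetic search seeded with the heuristic plan."""
    rng = random.Random(seed)
    baseline = heuristic_plan()
    upper_bound = cost(baseline)  # the search must never return anything worse
    pop = [baseline] + [
        [rng.randrange(NUM_SITES) for _ in TASKS] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]  # the cheapest plans always survive
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(TASKS))
            child = a[:cut] + b[cut:]  # one-point crossover
            if rng.random() < 0.2:     # occasional mutation of one gene
                child[rng.randrange(len(child))] = rng.randrange(NUM_SITES)
            children.append(child)
        pop = survivors + children
    best = min(pop, key=cost)
    return best, cost(best), upper_bound


if __name__ == "__main__":
    plan, plan_cost, bound = genetic_search()
    print("plan:", plan, "cost:", plan_cost, "heuristic bound:", bound)
```

Seeding the population with the heuristic plan is only one way to enforce such a bound; the paper may apply it differently, for example by pruning candidates whose cost exceeds it.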
