Abstract

As database management systems grow more complex, predicting the execution time of graph queries before they run has become a key challenge for query scheduling, workload management, resource allocation, and progress monitoring. Existing query performance prediction methods address this problem for traditional SQL queries, but they cannot be applied directly to Cypher queries on the Neo4j database. Additionally, most query performance prediction methods focus on measuring the relationship between correlation coefficients and retrieval performance. Inspired by machine-learning methods and graph query optimization technologies, we use a radial basis function (RBF) neural network as the prediction model for the execution time of Cypher queries. Query pattern features, graph data features, and query plan features are fused together and used to train the prediction model. We also deployed a monitor node and designed a Cypher query benchmark for the database cluster to obtain the query plan information and the native data store. The experimental results of four benchmarks showed that the average mean relative error of the RBF model reached 16.5% on the Northwind dataset, 12% on the FIFA2021 dataset, and 16.25% on the CORD-19 dataset. These results demonstrate the effectiveness of the proposed approach on three real-world datasets.

Highlights

  • Pattern query is a fundamental operation for graph query processing, which usually occurs in social network analysis [1], biological network analysis [2], transaction scheduling [3], knowledge graph search [4], and access control [5]

  • Cypher queries: we present an overview of the Cypher query processing pipeline and briefly describe the query plan and its operators

  • Here n denotes the number of templates, ŷi denotes the predicted value for the Cypher queries, and yi denotes the actual value
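
The symbols in the last highlight appear to belong to the mean relative error reported in the abstract. The exact formula is not reproduced in this summary, so the sketch below assumes the common definition, the average of |ŷi − yi| / |yi| over the n templates. The function name and the sample numbers are illustrative only.

```python
def mean_relative_error(predicted, actual):
    """Average of |predicted_i - actual_i| / |actual_i| over all items.

    Assumes the common definition of mean relative error; the summary
    does not spell out the formula used in the paper.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs(p - a) / abs(a) for p, a in zip(predicted, actual))
    return total / len(actual)


# Made-up execution times (ms) for two query templates.
example_error = mean_relative_error([110.0, 90.0], [100.0, 100.0])
```

Under this definition, a value of 0.165 would correspond to the 16.5% figure quoted for the Northwind dataset.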


Summary

Introduction

Pattern query is a fundamental operation in graph query processing and commonly arises in social network analysis [1], biological network analysis [2], transaction scheduling [3], knowledge graph search [4], and access control [5]. Predicting the performance of a graph pattern query before it executes is a significant issue in modern database management systems (DBMS) [8,9,10]. Selecting features for graph query performance prediction is a challenge: the quality of the prediction model depends on the selected query features, and irrelevant features add noise during training. To address these challenges, we propose a learning-based neural network approach for predicting the execution time of graph pattern queries. We use an RBF neural network to train on the feature vectors and use the resulting model to predict the execution time of new Cypher queries. To capture the query structure, we propose a pattern-modeling method and encode the query patterns into corresponding feature vectors. The rest of this paper is organized as follows: Section 2 reviews related work on query performance prediction; Section 3 briefly introduces Cypher queries and the prediction model architecture; Section 4 presents our query feature modeling and prediction approaches; Section 5 evaluates and analyzes the experimental results; and Section 6 concludes the paper.
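
To make the idea of training an RBF network on fused feature vectors concrete, the following is a minimal sketch rather than the authors' implementation. It assumes Gaussian hidden units whose centers are sampled from the training data, a fixed width `gamma`, and output weights fitted by stochastic gradient descent; the summary specifies none of these choices. The names `fuse_features` and `RBFNetwork`, the example features, and the synthetic data are all hypothetical.

```python
import math
import random


def fuse_features(pattern, graph, plan):
    """Concatenate the three feature groups into one vector (assumed fusion)."""
    return list(pattern) + list(graph) + list(plan)


class RBFNetwork:
    """Gaussian RBF hidden layer followed by a linear output unit."""

    def __init__(self, n_centers=8, gamma=2.0, lr=0.05, epochs=300, seed=0):
        self.n_centers, self.gamma = n_centers, gamma
        self.lr, self.epochs, self.seed = lr, epochs, seed
        self.centers, self.weights, self.bias = [], [], 0.0

    def _hidden(self, x):
        # One Gaussian activation per center: exp(-gamma * ||x - c||^2).
        return [
            math.exp(-self.gamma * sum((a - b) ** 2 for a, b in zip(x, c)))
            for c in self.centers
        ]

    def _output(self, h):
        return sum(w * hj for w, hj in zip(self.weights, h)) + self.bias

    def fit(self, X, y):
        rng = random.Random(self.seed)
        # Assumption: centers are a random subset of the training vectors.
        self.centers = rng.sample(X, min(self.n_centers, len(X)))
        self.weights = [0.0] * len(self.centers)
        self.bias = 0.0
        for _ in range(self.epochs):
            for x, target in zip(X, y):
                h = self._hidden(x)
                err = self._output(h) - target
                # Gradient step on the squared error of the linear output.
                self.weights = [
                    w - self.lr * err * hj for w, hj in zip(self.weights, h)
                ]
                self.bias -= self.lr * err
        return self

    def predict(self, x):
        return self._output(self._hidden(x))


# Synthetic, made-up data: one feature per group, each scaled to [0, 1].
data_rng = random.Random(1)
X, y = [], []
for _ in range(40):
    pattern = [data_rng.random()]  # e.g. normalized pattern size (hypothetical)
    graph = [data_rng.random()]    # e.g. normalized label count (hypothetical)
    plan = [data_rng.random()]     # e.g. normalized estimated rows (hypothetical)
    X.append(fuse_features(pattern, graph, plan))
    y.append(0.5 * pattern[0] + graph[0] * plan[0])  # toy "execution time"

model = RBFNetwork().fit(X, y)
predicted_time = model.predict(fuse_features([0.3], [0.6], [0.2]))
```

Because the Gaussian width is fixed, the sketch expects features scaled to a common range. A real pipeline would also need to choose the centers and widths more carefully, for example by clustering.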

Relational-Based Queries
Graph-Based Queries
Overview of Performance Prediction Architecture
Prediction Model
Monitor
Evaluation Techniques
Experiment Results of Prediction Models
Conclusions