Detecting coherent explorations in SQL workloads

Verónika Peralta, Patrick Marcel, Willeme Verdeaux, Aboubakar Sidikhy Diakhaby

doi:10.1016/j.is.2019.101479

Abstract

This paper presents a proposal for better understanding a workload of SQL queries and detecting the coherent explorations hidden within it. In particular, our work investigates SQLShare (Jain et al., 2016), a database-as-a-service platform targeting scientists and data scientists with minimal database experience, whose workload was made available to the research community. According to Jain et al. (2016), this workload is the only one containing primarily ad-hoc hand-written queries over user-uploaded datasets. We analyze this workload by extracting features that characterize SQL queries, and we investigate three machine learning approaches that use these features to separate sequences of SQL queries into meaningful explorations. The first approach is unsupervised and based only on the similarity between contiguous queries. The second approach uses transfer learning to apply a model trained on a dataset where ground truth is available. The last approach uses weak labeling, in which heuristics label a training set, to predict the most probable segmentation. We ran several tests over various query workloads to evaluate and compare the proposed methods.
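To make the first, unsupervised idea concrete, the following is a minimal sketch of segmenting a query sequence by the similarity between contiguous queries. It is not the authors' method: the token-set representation, the Jaccard measure, the `threshold` value, and the function names `tokens`, `jaccard`, and `segment` are all assumptions chosen for illustration, standing in for the query features the abstract mentions but does not list.

```python
import re

def tokens(sql):
    """Lowercased keyword/identifier tokens of a query.
    Assumption: a crude stand-in for real query features."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", sql.lower()))

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def segment(queries, threshold=0.3):
    """Split a query sequence into explorations, starting a new one
    whenever two contiguous queries are less similar than `threshold`.
    The threshold value is hypothetical, not taken from the paper."""
    explorations = []
    current = []
    prev = None
    for q in queries:
        t = tokens(q)
        if prev is not None and jaccard(prev, t) < threshold:
            explorations.append(current)
            current = []
        current.append(q)
        prev = t
    if current:
        explorations.append(current)
    return explorations

# Hypothetical workload: two queries on one table, then an unrelated one.
workload = [
    "SELECT station, temp FROM readings",
    "SELECT station, temp FROM readings WHERE temp > 30",
    "SELECT gene, count(*) FROM samples GROUP BY gene",
]
explorations = segment(workload)
```

With this toy workload the first two queries share most of their tokens and stay together, while the third shares only SQL keywords with its predecessor and opens a new exploration. A realistic setting would need richer features and a tuned threshold.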

Journal: Information Systems	Publication Date: Dec 9, 2019
Citations: 2	License type: publisher-specific-oa

Similar Papers

Mining SQL workloads for learning analysis behavior
Clement Moreau ... Mohamed Ali Hamrouni
Information Systems | VOL. 108
15 Feb 2022

SQL query log analysis for identifying user interests and query recommendations
26 Nov 2020

Evaluation of Machine Learning Algorithms in Predicting the Next SQL Query from the Future
Venkata Vamsikrishna Meduri ... Mohamed Sarwat
ACM Transactions on Database Systems | VOL. 46
18 Mar 2021

Towards Modelling Insiders Behaviour as Rare Behaviour to Detect Malicious RDBMS Access
Muhammad Imran Khan ... Barry O'Sullivan
01 Dec 2018
