Abstract

Most available semantic parsing datasets, comprising pairs of natural utterances and logical forms, were collected solely for training and evaluating natural language understanding systems. As a result, they lack the richness and variety of naturally occurring utterances, in which humans ask about data they need or are curious about. In this work, we release SEDE, a dataset of 12,023 pairs of utterances and SQL queries collected from real usage on the Stack Exchange website. We show that these pairs exhibit a variety of real-world challenges rarely reflected in other semantic parsing datasets, propose an evaluation metric based on comparing partial query clauses that is better suited to real-world queries, and conduct experiments with strong baselines, showing a large gap between performance on SEDE and on other common datasets.

Highlights

  • The task of mapping natural language into logical forms that can be executed on a database or knowledge graph has been studied mostly on academic datasets, where both the utterances and the queries were written as part of a dataset collection process (Hemphill et al., 1990; Zelle and Mooney, 1996; Yu et al., 2018), rather than in a natural process where users ask questions about data they need or are curious about

  • Compared to other Text-to-SQL datasets, we show that SEDE contains at least 10 times more SQL query templates than other datasets, and has the most diverse set of utterances and SQL queries of all single-domain datasets (an illustrative utterance-query pair is sketched after this list)

  • Standard evaluation metrics such as denotation accuracy and exact comparison of SQL components can often be used with relative success on existing datasets, but we found applying them to be a much greater challenge in SEDE
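To make the flavor of the data concrete, the sketch below shows what a SEDE-style record might look like: a natural question title paired with the T-SQL query a Stack Exchange user actually ran. The record, its field names, and the query text are constructed for illustration and are not taken from the dataset, though the Posts table, its columns, and the TOP/ORDER BY idioms are typical of the public Stack Exchange schema.

```python
# A constructed, SEDE-style utterance/query pair (illustrative only; the
# field names and contents are assumptions, not actual dataset records).
example_record = {
    "title": "Top 20 unanswered questions with the highest score",
    # Real SEDE queries are T-SQL against the public Stack Exchange schema.
    # Note the under-specification: the utterance never mentions PostTypeId.
    "query": """
        SELECT TOP 20 p.Id, p.Title, p.Score
        FROM Posts AS p
        WHERE p.PostTypeId = 1        -- questions only
          AND p.AnswerCount = 0       -- unanswered
        ORDER BY p.Score DESC
    """,
}
```

A clause such as `PostTypeId = 1` is exactly the kind of detail that is implicit in the utterance yet changes execution results entirely, which is why denotation accuracy is problematic for such real-world queries.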


Introduction

The task of mapping natural language into logical forms that can be executed on a database or knowledge graph has been studied mostly on academic datasets, where both the utterances and the queries were written as part of a dataset collection process (Hemphill et al., 1990; Zelle and Mooney, 1996; Yu et al., 2018), rather than in a natural process where users ask questions about data they need or are curious about. Evaluating parsers on such real-world queries is also challenging: denotation accuracy is inaccurate for under-specified utterances, where any single clause not mentioned in the question could entirely change the execution results, while exact-match comparison of SQL components (e.g. comparing all SELECT, WHERE, GROUP BY and ORDER BY clauses) is often too strict when queries are highly complex. While fully solving these issues remains an open problem, to at least partially address them we propose a softer version of the exact-match metric, PCM-F1, based on partially extracted query components, and show that this metric gives a better indication of model performance than common metrics, which yield scores close to 0. We hope that the unique and challenging properties exhibited in SEDE will pave a path for future work on generalizable semantic parsing in realistic settings.
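The paper computes PCM-F1 over sub-trees of the parsed queries (see "Sub-tree elements matching" in the outline below); the regex-based clause splitting and the helper names here are our simplified assumptions, not the authors' implementation. A minimal sketch of the underlying idea, scoring per-clause token overlap and averaging across clauses:

```python
import re

# Top-level clauses considered by this simplified scorer.
CLAUSES = ["SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY"]

def split_clauses(sql: str) -> dict:
    """Naively split a query into its top-level clauses.

    Illustrative only: this breaks on nested subqueries; the paper's
    PCM-F1 parses queries into sub-trees instead of splitting on keywords.
    """
    pattern = r"\b(" + "|".join(CLAUSES) + r")\b"
    parts = re.split(pattern, sql, flags=re.IGNORECASE)
    clauses = {}
    i = 1
    while i < len(parts) - 1:
        clauses[parts[i].upper()] = parts[i + 1].strip()
        i += 2
    return clauses

def f1(pred: set, gold: set) -> float:
    """F1 overlap between two sets of clause tokens."""
    if not pred and not gold:
        return 1.0
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

def pcm_f1(pred_sql: str, gold_sql: str) -> float:
    """Average per-clause F1 over all clauses present in either query."""
    pred, gold = split_clauses(pred_sql), split_clauses(gold_sql)
    keys = set(pred) | set(gold)
    scores = [
        f1(set(pred.get(k, "").split()), set(gold.get(k, "").split()))
        for k in keys
    ]
    return sum(scores) / len(scores) if scores else 1.0
```

For instance, `pcm_f1("SELECT Id FROM Posts WHERE Score > 10", "SELECT Id FROM Posts")` returns about 0.67, crediting the matching SELECT and FROM clauses rather than scoring the whole prediction 0 as exact-match comparison would.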

Background
Stack Exchange Data Explorer
Data cleaning
Dataset Characteristics
Limitations
Evaluation
Sub-tree elements matching
Experimental Setup
Main Results
PCM-F1 Validation
Error Analysis