Abstract

Sequence-to-sequence models have been widely used in recent years across natural language processing tasks. In particular, they have been widely adopted for translating natural language questions into SQL, and many studies apply sequence-to-sequence approaches to predict the target SQL queries on the available datasets. In this paper, we highlight another way to address natural language processing tasks, and the natural-language-to-SQL task in particular: sketch-based decoding, in which the model incrementally fills the holes of a query sketch. We present the pros and cons of each approach and show how a sketch-based model can outperform existing solutions at predicting the target SQL queries and at generalizing to unseen inputs across different contexts and cross-domain datasets. Finally, we discuss the test results of previously proposed models, using exact matching scores, error propagation, and training time as metrics.

Highlights

  • Many approaches to Natural Language Processing (NLP) tasks have been studied in depth, among them the sequential models that became the pillars of tasks such as language translation and text summarization

  • Another class of tasks is the translation of natural language into database languages such as SQL, XQuery, XPath and others (NL to Query). Models of this kind were a big step toward a consistent solution for translating natural language sentences into structured queries such as SQL, unlike traditional approaches based on syntactic parsing

  • Unlike the NLP tasks cited above, NL to Query is the task of translating a user's question into a database query language, following a predefined syntax, in order to extract data from database systems


Summary

Introduction

Many approaches to Natural Language Processing (NLP) tasks have been studied in depth, among them the sequential models that became the pillars of tasks such as language translation and text summarization. The sequence-to-sequence structure gave satisfying results on simple machine learning problems, replacing the older linguistic techniques and syntactic methods, which lack precision and struggle when exposed to complex inputs and data. Another class of tasks is the translation of natural language into database languages such as SQL, XQuery, XPath and others (NL to Query). Spider (Yu et al., 2018) is another dataset, one that covers multiple domains in a single corpus. It contains about 10,000 question/SQL query pairs that can be used as a starting point for training semantic parsing-based models.
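To make the idea of a sketch with holes more concrete, the following is a minimal Python sketch, not the model proposed in the paper. It assumes a single WikiSQL-style query template, and it replaces the trained per-hole predictors with hypothetical keyword rules (`predict_sel_col`, `predict_cond_col`, and so on); the column list, the hole order, and the `exact_match` normalization are likewise assumptions made for illustration. What it shows is the control flow: holes are filled one at a time, and a later hole can depend on an earlier one.

```python
import re

# Assumed query sketch: fixed SQL keywords with named holes to fill.
SKETCH = "SELECT {sel_col} FROM {table} WHERE {cond_col} {cond_op} {cond_val}"

# Hypothetical schema for a toy "cities" table.
COLUMNS = ["name", "country", "population"]


def mentioned_columns(question):
    """Schema columns that appear in the question, in order of appearance."""
    q = question.lower()
    found = [c for c in COLUMNS if c in q]
    return sorted(found, key=q.index)


# Each predictor stands in for a trained classifier over one hole.
def predict_sel_col(question, filled):
    return mentioned_columns(question)[0]


def predict_cond_col(question, filled):
    # Depends on an earlier hole: skip the column already selected.
    return next(c for c in mentioned_columns(question) if c != filled["sel_col"])


def predict_cond_op(question, filled):
    q = question.lower()
    if "greater than" in q:
        return ">"
    if "less than" in q:
        return "<"
    return "="


def predict_cond_val(question, filled):
    return re.search(r"\d+", question).group()


# Assumed filling order; each step sees the holes filled so far.
HOLE_ORDER = [
    ("sel_col", predict_sel_col),
    ("cond_col", predict_cond_col),
    ("cond_op", predict_cond_op),
    ("cond_val", predict_cond_val),
]


def fill_sketch(question, table):
    """Fill the sketch hole by hole and return the resulting SQL string."""
    filled = {"table": table}
    for hole, predict in HOLE_ORDER:
        filled[hole] = predict(question, filled)
    return SKETCH.format(**filled)


def exact_match(predicted, gold):
    """Exact matching after lowercasing and collapsing whitespace."""
    def normalize(sql):
        return " ".join(sql.lower().split())
    return normalize(predicted) == normalize(gold)


question = "What is the name of the city whose population is greater than 1000000?"
predicted = fill_sketch(question, "cities")
print(predicted)
print(exact_match(predicted, "select name from cities where population > 1000000"))
```

Because the SQL keywords come from the sketch rather than from the decoder, the output is syntactically well formed by construction; the predictors only decide what goes into each hole. The flip side is that a wrong early hole (here `sel_col`) propagates into the holes that depend on it.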

