Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata

Alessandra Giordani, Alessandro Moschitti

doi:10.1007/978-3-642-31178-9_16

Abstract

This research concerns the translation of natural language questions into SQL queries, exploiting the MySQL framework for both hypothesis construction and thesis verification in the task of question answering. We use linguistic dependencies and metadata to build sets of possible SELECT and WHERE clauses. We then use the metadata again to build FROM clauses enriched with meaningful joins. Finally, we combine all the clauses to obtain the set of all possible SQL queries, each of which produces a candidate answer to the question. Our algorithm can be applied recursively to handle complex questions that require nested SELECT instructions. Additionally, we propose a weighting scheme that orders the generated queries by their probability of correctness. Our preliminary results are encouraging: the system generates the right SQL query among the first five in 92% of cases. This result can be greatly improved by re-ranking the queries with machine learning methods.
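The combine-and-rank step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rank_queries`, the example schema, the candidate clauses, and all weights are hypothetical, and scoring a query by the product of its clause weights is an assumption made only for illustration.

```python
from itertools import product


def rank_queries(selects, froms, wheres):
    """Combine weighted clause candidates into ranked SQL strings.

    Each argument is a list of (clause_text, weight) pairs. The score of a
    full query is ASSUMED to be the product of its clause weights; the
    paper's actual weighting scheme may differ.
    """
    ranked = []
    # Cartesian product: every SELECT with every FROM with every WHERE.
    for (s, ws), (f, wf), (w, ww) in product(selects, froms, wheres):
        sql = f"SELECT {s} FROM {f}"
        if w:  # an empty WHERE candidate means "no condition"
            sql += f" WHERE {w}"
        ranked.append((ws * wf * ww, sql))
    # Highest-scoring query first.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


# Hypothetical candidates for the question "Which rivers run through Texas?"
selects = [("river.name", 0.8), ("state.name", 0.2)]
froms = [
    ("river", 0.7),
    ("river JOIN state ON river.state_id = state.id", 0.3),
]
wheres = [("river.state = 'texas'", 0.9), ("", 0.1)]

ranked = rank_queries(selects, froms, wheres)
for score, sql in ranked[:5]:
    print(f"{score:.3f}  {sql}")
```

In this sketch, taking the first five entries of `ranked` corresponds to the "first five" evaluation setting mentioned in the abstract; a learned re-ranker would replace the simple product score.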
