Abstract

This paper addresses the problem of mapping natural language to SQL queries. It assumes that the input is in English and details a methodology for building a SQL query from the input sentence, a dictionary, and a set of production rules. The dictionary consists of semantic sets and index files. A semantic set is created for each table or attribute name and contains synonyms, hyponyms, and hypernyms retrieved from WordNet and complemented manually. The index files contain pointers to records in the database, ordered by value and by data type. Together, the dictionary and the production rules form a context-free grammar for producing the SQL queries. Context ambiguities are addressed through the derivationally related forms provided by WordNet. Building the semantic sets of the input tokens at run time helps resolve ambiguities related to the database schema. The proposed method introduces two functional entities: a pre-processor and a run-time engine. The pre-processor reads the database schema and uses WordNet to create the semantic sets and the set of production rules. It also reads the database records and creates the index files. The run-time engine matches the input tokens to the dictionary and uses the rules to create the corresponding SQL query.
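
To make the run-time step concrete, the following is a minimal sketch of how input tokens might be matched against a dictionary and assembled into a query. It is not the paper's implementation: the schema (`employees`, `name`, `salary`, `city`), the hand-written semantic sets (standing in for WordNet output), the value index, and the function names are all hypothetical, and a simple slot-filling step is used in place of the context-free production rules.

```python
import re

# Hypothetical semantic sets: one per table or attribute name.
# In the described method these would come from WordNet plus manual additions.
SEMANTIC_SETS = {
    "employees": {"employee", "employees", "worker", "workers", "staff"},
    "name": {"name", "names"},
    "salary": {"salary", "salaries", "pay", "wage", "wages"},
    "city": {"city", "cities", "town", "towns"},
}

# Hypothetical schema: table name -> attribute names.
TABLES = {"employees": ["name", "salary", "city"]}

# Hypothetical value index: lowercased record value -> (table, attribute, stored value).
VALUE_INDEX = {
    "boston": ("employees", "city", "Boston"),
    "paris": ("employees", "city", "Paris"),
}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def match_token(token):
    """Return the schema element whose semantic set contains the token, if any."""
    for element, words in SEMANTIC_SETS.items():
        if token in words:
            return element
    return None


def build_query(sentence):
    """Fill SELECT / FROM / WHERE slots from the matched tokens (simplified stand-in
    for the production rules)."""
    table, columns, conditions = None, [], []
    for token in tokenize(sentence):
        element = match_token(token)
        if element in TABLES:
            table = element
        elif element is not None:
            if element not in columns:
                columns.append(element)
        elif token in VALUE_INDEX:
            value_table, attribute, value = VALUE_INDEX[token]
            conditions.append((attribute, value))
            table = table or value_table
    if table is None:
        # Infer the table from the matched columns when no table word was present.
        for candidate, attributes in TABLES.items():
            if columns and all(c in attributes for c in columns):
                table = candidate
                break
    if table is None:
        raise ValueError("no table could be identified from the input")
    sql = f"SELECT {', '.join(columns) if columns else '*'} FROM {table}"
    if conditions:
        # Values come from the index built over database records, not from raw input.
        sql += " WHERE " + " AND ".join(f"{a} = '{v}'" for a, v in conditions)
    return sql


print(build_query("Show the names and wages of workers in Boston"))
```

In this sketch, "wages" and "workers" never appear in the schema; they resolve to `salary` and `employees` only through the semantic sets, and "Boston" resolves to the `city` attribute only through the value index. Unmatched tokens such as "show" and "the" are simply ignored, which a grammar-based approach would handle more carefully.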

