Abstract

Text-to-SQL is a semantic parsing method that converts natural language questions into SQL queries, with the aim of extracting data from any relational database without requiring knowledge of how to write SQL. Although large-scale datasets (WikiSQL, SPIDER) and pre-trained language models (BERT) have improved Text-to-SQL performance in English, dataset construction and model research for other languages have made little progress. This study therefore proposes a multilingual BERT-based Text-to-SQL methodology that converts a natural language question in Korean into an SQL query for an English database. To this end, four strategies for translating Korean queries into English were explored, and their effectiveness was verified by applying each strategy to three Text-to-SQL model structures. The experiments confirmed that the proposed methodology achieves significant SQL generation performance even for Korean questions. The methodology is meaningful in that it shows that semantic inference across database tables, column information, and questions written in different languages is possible, and it is expected to support efficient database access for Korean users who lack proficiency in writing SQL queries.
