Natural Language Interfaces to Data

Abstract

Recent advances in natural language understanding and processing have resulted in renewed interest in natural language interfaces to data, which provide an easy mechanism for non-technical users to access and query the data. While early systems evolved from keyword search and focused on simple factual queries, the complexity of both the input sentences and the generated SQL queries has grown over time. More recently, there has also been a lot of focus on conversational interfaces for data analytics, empowering line-of-business owners and non-technical users with quick insights into the data. There are three main challenges in natural language querying: (1) identifying the entities involved in the user utterance, (2) connecting the different entities in a meaningful way over the underlying data source to interpret user intents, and finally (3) generating a structured query in the form of SQL or SPARQL. There are two main approaches in the literature for interpreting a user's natural language query. Rule-based systems make use of semantic indices, ontologies, and knowledge graphs to identify the entities in the query, understand the intended relationships between those entities, and utilize grammars to generate the target queries. With the advances in deep learning-based language models, many text-to-SQL approaches have emerged that interpret the query holistically using such models. Hybrid approaches that combine the strengths of rule-based techniques and deep learning models are also emerging. Conversational interfaces are the natural next step beyond one-shot natural language querying, exploiting query context across multiple turns of conversation for disambiguation. In this monograph, we review the background technologies that are used in natural language interfaces, and survey the different approaches to natural language querying. We also describe conversational interfaces for data analytics and discuss several benchmarks used for natural language querying research and evaluation.
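The three challenges above can be made concrete with a toy rule-based pipeline. The sketch below is purely illustrative: the schema, the synonym index, and the SELECT-only query template are assumptions invented for this example, not a system discussed in the monograph.

```python
# Minimal rule-based text-to-SQL sketch. The schema, synonym index,
# and SELECT-only template are illustrative assumptions.

SCHEMA = {"employees": ["name", "salary", "department"]}

# (1) Entity identification: map utterance tokens to schema elements
# via a small semantic index of synonyms.
INDEX = {
    "employee": ("table", "employees"),
    "employees": ("table", "employees"),
    "salary": ("column", "salary"),
    "salaries": ("column", "salary"),
    "department": ("column", "department"),
    "name": ("column", "name"),
    "names": ("column", "name"),
}

def to_sql(utterance: str) -> str:
    tokens = utterance.lower().replace("?", "").split()
    table, columns = None, []
    for tok in tokens:
        hit = INDEX.get(tok)
        if hit is None:
            continue  # token is not a known entity
        kind, name = hit
        if kind == "table":
            table = name
        elif name not in columns:
            columns.append(name)
    if table is None:
        raise ValueError("no table entity recognized in the utterance")
    # (2) Entity linking: keep only columns that belong to the table.
    columns = [c for c in columns if c in SCHEMA[table]] or ["*"]
    # (3) Query generation from a fixed SELECT template.
    return f"SELECT {', '.join(columns)} FROM {table}"

print(to_sql("show the names and salaries of all employees"))
# SELECT name, salary FROM employees
```

Real systems replace each of the three hand-coded steps (the index, the linking rule, and the template) with far richer machinery, such as ontologies or learned models, but the division of labor is the same.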

Similar Papers
  • Research Article
  • Cited by: 4
  • 10.1108/jeim-01-2015-0005
A novel method for providing relational databases with rich semantics and natural language processing
  • Apr 10, 2017
  • Journal of Enterprise Information Management
  • Kamal Hamaz + 1 more

Purpose – With the development of systems and applications, the number of users interacting with databases has increased considerably. The relational database model is still the most widely used model for data storage and manipulation. However, it offers no semantic support for the stored data that could facilitate data access for users. Indeed, a large number of users are intimidated when retrieving data because they are non-technical or have little technical knowledge. To overcome this problem, researchers continue to develop new techniques for Natural Language Interfaces to Databases (NLIDB). Today, the usage of existing NLIDBs is not widespread due to their deficiencies in understanding natural language (NL) queries. The purpose of this paper is therefore to propose a novel method for the intelligent understanding of NL queries using semantically enriched database sources. Design/methodology/approach – First, a reverse engineering process is applied to extract the relational database's hidden semantics. In the second step, the extracted semantics are further enriched using a domain ontology. After this, all semantics are stored in the same relational database. The NL query processing phase uses the stored semantics to generate a semantic tree. Findings – The evaluation shows the advantages of using a semantically enriched database source to understand NL queries. Additionally, enriching the relational database gives more flexibility in understanding contextual and synonymous words that may be used in an NL query. Originality/value – Existing NLIDBs are not yet a standard option for interfacing a relational database because of their shortcomings in understanding NL queries; the techniques used in the literature have their limits. This paper addresses those limits by identifying NL elements by their semantic nature in order to generate a semantic tree, which is a key step toward an intelligent understanding of NL queries to relational databases.

  • Research Article
  • Cited by: 8
  • 10.1108/oir-12-2011-0210
Do natural language search engines really understand what users want?
  • Apr 12, 2013
  • Online Information Review
  • Nadjla Hariri

Purpose – The main purpose of this research is to determine whether the performance of natural language (NL) search engines in retrieving exact answers to the NL queries differs from that of keyword searching search engines. Design/methodology/approach – A total of 40 natural language queries were posed to Google and three NL search engines: Ask.com, Hakia and Bing. The first results pages were compared in terms of retrieving exact answer documents and whether they were at the top of the retrieved results, and the precision of exact answer and relevant documents. Findings – Ask.com retrieved exact answer document descriptions at the top of the results list in 60 percent of searches, which was better than the other search engines, but the mean value of the number of exact answer top list documents for three NL search engines (20.67) was a little less than Google's (21). There was no significant difference between the precision for Google and three NL search engines in retrieving exact answer documents for NL queries. Practical implications – The results imply that all NL and keyword searching search engines studied in this research mostly employ similar techniques using keywords of the NL queries, which is far from semantic searching and understanding what the user wants in searching with NL queries. Originality/value – The results shed light into the claims of NL search engines regarding semantic searching of NL queries.

  • Single Report
  • Cited by: 2
  • 10.21236/ada294037
Focus of Attention in Decision Support Systems.
  • May 12, 1995
  • John O Gurney + 3 more

In this paper we describe three decision support systems with graphical user interfaces, and illustrate that interactions with such systems can be improved by the addition of natural language interfaces. NLIs can be added to systems after the initial development phase without loss of functionality. Furthermore, we show that a hybrid natural language and graphical user interface with multi-modal capability improves access to decision aids more than either one independently. We also show that natural language interfaces can help a user maintain his focus of attention in complex situations. A natural language interface to a decision support system has a number of important advantages. It is especially useful in recognizing what the user is thinking about at the time of interaction, i.e., identifying the user's focus of attention, and it provides a way for the user to maintain focus on his problem without distraction. A natural language interpreter can access the system for the user and assemble a relevant, well-focused response. An NLI allows the user flexibility of expression, which is important if the user does not know how to say what he intends in the GUI format. NLIs can handle complex focus spaces, since focus is readily conveyed through natural language; in fact, natural language may be the only way it can be readily handled. Finally, and importantly, natural language interfaces can handle a user's counterfactual thinking. Because of the complexity and power anticipated in next-generation decision support systems, natural language interface technology warrants a closer look. It can serve to help the user manage complex and quantitatively massive amounts of information in a natural and easily expressible way.

  • Conference Article
  • Cited by: 1
  • 10.1145/320599.320670
The human factors of natural language query systems
  • Jan 1, 1985
  • William C Ogden


  • Book Chapter
  • Cited by: 7
  • 10.1007/11555261_12
Natural Language Query vs. Keyword Search: Effects of Task Complexity on Search Performance, Participant Perceptions, and Preferences
  • Jan 1, 2005
  • Qianying Wang + 2 more

A 2x2 mixed design experiment (N=52) was conducted to examine the effects of search interface and task complexity on participants’ information-seeking performance and affective experience. Keyword vs. natural language search was the within-participants factor; simple vs. complex tasks was the between-participants factor. There were cross-over interactions such that complex-task participants were more successful and thought the tasks were less difficult and reported more enjoyment and confidence when they used keyword search vs. natural language queries, while the opposite was found for simple-task participants. The findings suggest that natural language search is not the panacea for all information retrieval tasks: task complexity is a critical mediator. Implications for interface design and directions for future research are discussed.

  • Research Article
  • Cited by: 81
  • 10.1109/access.2022.3149798
Framework for Deep Learning-Based Language Models Using Multi-Task Learning in Natural Language Understanding: A Systematic Literature Review and Future Directions
  • Jan 1, 2022
  • IEEE Access
  • Rahul Manohar Samant + 3 more

Learning human languages is a difficult task for a computer. However, Deep Learning (DL) techniques have enhanced performance significantly for almost all natural language processing (NLP) tasks. Unfortunately, these models do not generalize across all NLP tasks with similar performance. NLU (Natural Language Understanding) is a subset of NLP including tasks like machine translation, dialogue-based systems, natural language inference, text entailment, sentiment analysis, etc. Advancement in the field of NLU is the collective performance enhancement across all these tasks. Even though MTL (Multi-task Learning) was introduced before Deep Learning, it has gained significant attention in recent years. This paper aims to identify, investigate, and analyze various language models used in NLU and NLP to find directions for future research. The Systematic Literature Review (SLR) is prepared using the literature search guidelines proposed by Kitchenham and Charters on various language models between 2011 and 2021. This SLR points out that unsupervised learning-based language models show potential performance improvement. However, they face the challenge of designing a general-purpose framework for the language model, which would improve the performance of multi-task NLU and the generalized representation of knowledge. Combining these approaches may result in a more efficient and robust multi-task NLU. This SLR proposes building steps for a conceptual framework to achieve the goal of enhancing the performance of language models in the field of NLU.

  • Conference Article
  • Cited by: 16
  • 10.1109/icassp.2012.6289031
Translating natural language utterances to search queries for SLU domain detection using query click logs
  • Mar 1, 2012
  • Dilek Hakkani-Tur + 3 more

Logs of user queries from a search engine (such as Bing or Google) together with the links clicked provide valuable implicit feedback to improve statistical spoken language understanding (SLU) models. However, the form of natural language utterances occurring in spoken interactions with a computer differs stylistically from that of keyword search queries. In this paper, we propose a machine translation approach to learn a mapping from natural language utterances to search queries. We train statistical translation models using task- and domain-independent semantically equivalent natural language and keyword search query pairs mined from the search query click logs. We then extend our previous work on enriching the existing classification feature sets for input utterance domain detection with features computed using the click distribution over a set of clicked URLs from search engine query click logs of user utterances with automatically translated queries. This approach results in significant improvements for domain detection, especially when detecting the domains of user utterances that are formulated as natural language queries, and effectively complements the earlier work using syntactic transformations.

  • Research Article
  • Cited by: 26
  • 10.1098/rstb.2019.0313
Quasi-compositional mapping from form to meaning: a neural network-based approach to capturing neural responses during human language comprehension.
  • Dec 16, 2019
  • Philosophical Transactions of the Royal Society B: Biological Sciences
  • Milena Rabovsky + 1 more

We argue that natural language can be usefully described as quasi-compositional and we suggest that deep learning-based neural language models bear long-term promise to capture how language conveys meaning. We also note that a successful account of human language processing should explain both the outcome of the comprehension process and the continuous internal processes underlying this performance. These points motivate our discussion of a neural network model of sentence comprehension, the Sentence Gestalt model, which we have used to account for the N400 component of the event-related brain potential (ERP), which tracks meaning processing as it happens in real time. The model, which shares features with recent deep learning-based language models, simulates N400 amplitude as the automatic update of a probabilistic representation of the situation or event described by the sentence, corresponding to a temporal difference learning signal at the level of meaning. We suggest that this process happens relatively automatically, and that sometimes a more-controlled attention-dependent process is necessary for successful comprehension, which may be reflected in the subsequent P600 ERP component. We relate this account to current deep learning models as well as classic linguistic theory, and use it to illustrate a domain general perspective on some specific linguistic operations postulated based on compositional analyses of natural language. This article is part of the theme issue 'Towards mechanistic models of meaning composition'.

  • Conference Article
  • 10.1109/user.2012.6226574
Evaluating live sequence charts as a programming technique for non-programmers
  • Jun 1, 2012
  • Michal Gordon + 1 more

Behavioral programming is a recent programming paradigm that uses independent scenarios to program the behavior of reactive systems. Live sequence charts (LSC) is a visual formalism that implements the approach of behavioral programming. The approach attempts to liberate programming by allowing the user to program the behavior of reactive systems through scenarios. We evaluate the approach and seek the most natural interface for creating the visual artifact of LSCs. Several such interfaces exist, among them a novel interactive natural language (NL) interface. Initial testing indicates that the LSCs' NL interface may be preferred by programmers over procedural programming, and that for certain tasks LSCs may be a viable and more natural alternative to conventional programming. Many challenges exist in trying to prove the intuitive and natural nature of a new programming paradigm, which differs from others not only in syntax but in many other respects. We describe these challenges in this proposal.

  • Research Article
  • Cited by: 5
  • 10.1002/int.4550100902
A data management strategy for transportable natural language interfaces
  • Jan 1, 1995
  • International Journal of Intelligent Systems
  • Julia A Johnson + 1 more

This thesis focuses on the problem of designing a highly portable domain independent natural language interface for standard relational database systems. It is argued that a careful strategy for providing the natural language interface (NLI) with morphological, syntactic, and semantic knowledge about the subject of discourse and the database is needed to make the NLI portable from one subject area and database to another. There has been a great deal of interest recently in utilizing the database system to provide that knowledge. Previous approaches attempted to solve this challenging problem by capturing knowledge from the relational database (RDB) schema, but were unsatisfactory for the following reasons: 1.) RDB schemas contain referential ambiguities which seriously limit their usefulness as a knowledge representation strategy for NL understanding. 2.) Knowledge captured from the RDB schema is sensitive to arbitrary decisions made by the designer of the schema. In our work we provide a new solution by applying a conceptual model for database schema design to the design of a portable natural language interface. It has been our observation that the process used for adapting the natural language interface to a new subject area and database overlaps considerably with the process of designing the database schema. Based on this important observation, we design an enhanced natural language interface with the following significant features: complete independence of the linguistic component from the database component, economies in attaching the natural language and DB components, and sharing of knowledge about the relationships in the subject of discourse for database schema design and NL understanding.

  • Conference Article
  • Cited by: 3
  • 10.1145/317456.317460
The utility of natural language interfaces (panel session)
  • Jan 1, 1985
  • Philip J Hayes

Natural language interfaces are frequently proposed as a solution to the problems of “user-unfriendliness” present in many existing computer system interfaces. The panel will examine this claim, and discuss in what circumstances (if any) it is (or could be) true. As a starting point, let us define a natural language interface as an interface to a computer system that allows the user to control the system by English commands or queries. Sometimes the output seen by the user will also be in natural language. Currently, most natural language interfaces only accept typed, rather than spoken, input. Also, such interfaces typically can only handle input related to the restricted world of their underlying application, and moreover, only a subset (albeit expressively comprehensive) of that. Set against these advantages are the following standardly cited disadvantages. Verboseness: English commands or queries can take many more keystrokes to enter than equivalent formal command lines or menu-based selection. Coverage restrictions: since current natural language interfaces cannot handle all natural language inputs, not even all those relevant to their domain of discourse, the user is faced with the task of learning what the system can and cannot deal with, usually by trial and error. Given these conflicting arguments, it seems better to avoid the general question of whether natural language interfaces are Good or Bad. Instead, the panel will concentrate on how the utility of natural language interfaces is affected by the environment (broadly conceived) in which they operate. We will also be concerned with how the utility of specific natural language or other types of interface can be determined in specific circumstances. Factors affecting the utility of natural language interfaces include the type of user: natural language interfaces are better suited to novice or casual users than to expert or frequent users of a system. An expert or frequent user can afford the cost of learning a command language because of the terseness it allows, while it may be more economical for a novice or infrequent user to enter a verbose natural language input than to find out the correct terse command line. Combination with other input types: it may be possible to build interfaces which combine natural language and other types of interface in a way that retains the best features of both, while reducing the impact of their negative features. The above list of issues does not pretend to be comprehensive, but is intended as a basis for discussion. Many other issues will no doubt arise during the course of the panel.

  • Research Article
  • 10.1145/1165385.317460
The utility of natural language interfaces (panel session)
  • Apr 1, 1985
  • ACM SIGCHI Bulletin
  • Philip J Hayes


  • Conference Article
  • Cited by: 3
  • 10.1109/icsc.2020.00054
A Compositional Semantics for a Wide-Coverage Natural-Language Query Interface to a Semantic Web Triplestore
  • Feb 1, 2020
  • Shane Peelar + 1 more

Many Natural Language (NL) Query Interfaces to data stores convert queries to a formal query language and then execute the formal query in order to obtain the result. This is problematic when handling chained prepositional phrases. An alternative approach is to treat the NL query language as a formal language and to execute the NL query directly with respect to the data store. This approach can accommodate a wide range of NL queries and is relatively easy to implement if it is based on an extension of Richard Montague's denotational semantics (MS) for natural language. The higher-order functional capability of the programming language Haskell facilitates both the implementation of MS and the close integration of syntactic analysis with semantic processing. A publicly accessible web-based NL interface to a remote Semantic Web data store has been constructed to demonstrate the viability of this approach. The approach can be directly adapted for use with relational databases.
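As a rough illustration of executing an NL query directly against the data store, rather than compiling it to a formal query language, the toy interpreter below gives words set-valued denotations over a small triplestore and composes them. The vocabulary, facts, and simple right-to-left composition rule are invented for this sketch and are far simpler than the paper's Haskell-based Montague-semantics treatment (which handles, e.g., chained prepositional phrases).

```python
# Toy Montague-style interpreter: each word denotes a set of entities
# or a set-to-set function over a tiny triplestore; a query is
# evaluated by composing denotations directly, with no SQL/SPARQL
# stage. All facts and vocabulary are invented for illustration.

TRIPLES = {
    ("phobos", "orbits", "mars"),
    ("deimos", "orbits", "mars"),
    ("moon", "orbits", "earth"),
}

def entity(name):
    # A proper noun denotes the singleton set containing its referent.
    return {name}

def rel(pred):
    # A verb denotes a function from object sets to subject sets:
    # rel("orbits")({"mars"}) -> everything that orbits mars.
    def apply(objects):
        return {s for (s, p, o) in TRIPLES if p == pred and o in objects}
    return apply

VOCAB = {
    "mars": entity("mars"),
    "earth": entity("earth"),
    "orbits": rel("orbits"),
}

def evaluate(query):
    # Keep only words with a denotation, then fold right-to-left:
    # "what orbits mars" -> rel("orbits")(entity("mars")).
    words = [w for w in query.lower().split() if w in VOCAB]
    denotation = VOCAB[words[-1]]
    for w in reversed(words[:-1]):
        denotation = VOCAB[w](denotation)
    return denotation

print(sorted(evaluate("what orbits mars")))
# ['deimos', 'phobos']
```

The appeal of this style is that adding a new word only requires giving it a denotation; no query-language translation layer has to be extended.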

  • Conference Article
  • Cited by: 29
  • 10.5555/636669.636676
Conceptual information retrieval
  • Jun 23, 1980
  • Roger C Schank + 2 more

If we want to build intelligent information retrieval systems, we will have to give them the capabilities of understanding natural language, automatically organizing and reorganizing their memories, and using intelligent heuristics for searching their memories. These systems will have to analyze and understand both new text and natural language queries. In answering questions, they will have to direct memory search to reasonable places. This requires good organization of both the conceptual content of text and the knowledge necessary for understanding those texts and accessing memory. The CYRUS and FRUMP systems (Kolodner (1978), Schank and Kolodner (1979), DeJong (1979)) comprise an information retrieval system called CyFr. Together, they have the analysis and retrieval capabilities mentioned above. FRUMP analyzes news stories from the UPI wire for their conceptual content and produces summaries of those stories. It sends summaries of stories about important people to CYRUS, which automatically adds those stories to its memory and can then retrieve that information to answer questions posed to it in natural language. This paper describes the problems involved in building such an intelligent system. It proposes solutions to some of those problems based on recent research in Artificial Intelligence and Natural Language processing, and describes the CyFr system, which implements those solutions. The solutions we propose and implement are based on a model of human understanding and memory retrieval.

  • Conference Article
  • Cited by: 35
  • 10.2312/eurovisshort.20171133
Natural language interfaces for data analysis with visualization: considering what has and could be asked
  • Jun 12, 2017
  • Arjun Srinivasan + 1 more

Natural language is emerging as a promising interaction paradigm for data analysis with visualization. Designing and implementing Natural Language Interfaces (NLIs) is a challenging task, however. In addition to being able to process and understand natural language expressions, NLIs for data visualization must consider other factors including input modalities, providing input affordances, and explaining system results, among others. In this article, we examine existing NLIs for data analysis with visualization, and compare and contrast them based on the tasks they allow people to perform. We discuss open research opportunities and themes for emerging NLIs in the visualization community. We also provide examples from the existing literature in the broader HCI community that may help explore some of the highlighted themes for future work. Our goal is to help readers understand the subtleties and challenges in designing NLIs and to encourage the community to think further about NLIs for data analysis with visualization.
