Extractive methods for machine reading comprehension (MRC) tasks have achieved accuracy comparable to or better than human performance on benchmark data sets. However, such models are not as successful when adapted to complex domains such as health care. One of the main reasons is that the context the MRC model needs to process when operating in a complex domain can be much larger than an average open-domain context. This causes the MRC model to make less accurate and slower predictions. A potential solution to this problem is to reduce the input context of the MRC model by extracting only the necessary parts from the original context. This study aims to develop a method for extracting useful contexts from long articles as an additional component of the question answering task, enabling the MRC model to work more efficiently and accurately. Existing approaches to context extraction in MRC are based on sentence selection strategies, in which the models are trained to find the sentences containing the answer. We found that using only the sentences containing the answer was insufficient for the MRC model to predict correctly. We conducted a series of empirical studies and observed a strong relationship between the usefulness of the context and the confidence score output by the MRC model. Our investigation showed that a precise input context can boost the prediction correctness of the MRC model and greatly reduce inference time. We proposed a method to estimate the utility of each sentence in a context for answering the question and then extract a new, shorter context based on these estimates. We generated a data set to train 2 models for estimating sentence utility, based on which we selected more precise contexts that improved the MRC model's performance. We demonstrated our approach on the Question Answering Data Set for COVID-19 and Biomedical Semantic Indexing and Question Answering data sets and showed that it benefits the downstream MRC model. First, the method reduced the inference time of the entire question answering system by a factor of 6 to 7. Second, our approach helped the MRC model predict the answer more accurately than with the original context (F1-score increased from 0.724 to 0.744 for the Question Answering Data Set for COVID-19 and from 0.651 to 0.704 for the Biomedical Semantic Indexing and Question Answering data set). We also identified a potential problem: in some cases, extractive transformer MRC models predict poorly despite being given a more precise context. The proposed context extraction method allows the MRC model to achieve improved prediction correctness and significantly reduced inference time. The approach is technically compatible with any MRC model and has potential for tasks involving long texts.
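The core idea described above (score each sentence's utility for the question, keep only the highest-utility sentences in document order, and pass this shorter context to the extractive MRC model) can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the `utility_score` function is a lexical-overlap placeholder standing in for the paper's trained utility models, and the Hugging Face model name is an illustrative assumption rather than the system evaluated in the article.

```python
# Minimal sketch of sentence-utility context extraction for an extractive MRC model.
# The trained utility models from the study are not reproduced here; utility_score
# below is a crude heuristic stand-in that a learned scorer would replace.
import re
from transformers import pipeline  # pip install transformers

def split_sentences(text):
    """Crude sentence splitter; a proper tokenizer would be preferable in practice."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def utility_score(question, sentence):
    """Placeholder utility estimate: token overlap between question and sentence."""
    q_tokens = set(question.lower().split())
    s_tokens = set(sentence.lower().split())
    return len(q_tokens & s_tokens) / (len(q_tokens) or 1)

def extract_context(question, article, top_k=5):
    """Keep the top_k highest-utility sentences, preserving their original order."""
    sentences = split_sentences(article)
    ranked = sorted(range(len(sentences)),
                    key=lambda i: utility_score(question, sentences[i]),
                    reverse=True)[:top_k]
    return " ".join(sentences[i] for i in sorted(ranked))

# Illustrative extractive QA model; any extractive MRC model could be used here.
qa_model = pipeline("question-answering", model="deepset/roberta-base-squad2")

def answer(question, article):
    short_context = extract_context(question, article)
    result = qa_model(question=question, context=short_context)
    # result["score"] is the MRC confidence score referred to in the abstract.
    return result["answer"], result["score"]
```

Because the MRC model only sees the reduced context, inference cost scales with the few retained sentences rather than the full article, which is the source of the speedup reported above.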