Abstract

Why-type non-factoid questions are ambiguous and admit considerable variation in their answers. Returning a single appropriate answer to the user requires answer extraction, re-ranking and validation. In many cases the system must understand the meaning and context of a document rather than match the exact words of the question. This paper addresses the problem by exploring lexico-syntactic, semantic and contextual query-dependent features, some based on deep learning frameworks, that estimate the probability of an answer candidate being relevant to the question. The features are weighted by the feature-importance scores returned by an ensemble ExtraTreesClassifier. An answer re-ranker model is implemented that selects the answer candidate with the largest weighted feature similarity to the question, achieving a Mean Reciprocal Rank (MRR) of 0.64. Finally, the answer is validated by matching the answer type of each candidate, and the highest-ranked candidate with a matching answer type is returned to the user.
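
As a rough illustration of the pipeline described above, the sketch below weights question-answer similarity features by the feature importances of a scikit-learn ExtraTreesClassifier and re-ranks answer candidates by their weighted feature similarity. It is a minimal sketch under assumed data and feature dimensions, not the authors' implementation.

```python
# Minimal sketch (not the authors' implementation): weight question-answer
# similarity features by ExtraTreesClassifier feature importances and re-rank
# answer candidates by weighted feature similarity. All data and dimensions
# here are illustrative assumptions.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

rng = np.random.default_rng(0)

# X: one row per (question, answer-candidate) pair; columns are lexico-syntactic,
# semantic and contextual similarity features. y: 1 if the candidate correctly
# answers the question, else 0.
X_train = rng.random((200, 5))            # placeholder feature matrix
y_train = rng.integers(0, 2, 200)         # placeholder relevance labels

clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
weights = clf.feature_importances_        # one importance score per feature

def rerank(candidate_features):
    """Return candidate indices ordered by importance-weighted feature similarity."""
    scores = candidate_features @ weights  # weighted sum of similarities per candidate
    return np.argsort(-scores)

candidates = rng.random((10, 5))           # similarity features for 10 candidates
print(rerank(candidates))                  # best-ranked candidate index first
```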

Highlights

  • IBM’s Watson (IBM Watson, 2020) has shown remarkable results in answering open-domain questions

  • There are cases where the need is to understand the meaning and context of a document rather than find the exact words of the question; the paper addresses this problem by exploring lexico-syntactic, semantic and contextual query-dependent features, some based on deep learning frameworks, that estimate the probability of an answer candidate being relevant to the question

  • Various features covering lexico-syntactic, semantic and contextual similarities are employed to estimate the relevance of each answer candidate to a question


Summary

INTRODUCTION

The advent of IBM’s Watson (IBM Watson, 2020) has shown remarkable results in answering open-domain questions. Research in the question answering domain has achieved high accuracy, around 85%, in answering factoid-type questions. Work by Verberne et al. (2010), Jansen and Surdeanu (2014), Fried and Jansen (2015), and Oh et al. (2012, 2013) has been successful in answering open-domain non-factoid questions, while Tran and Niederee (2018) investigated deep learning frameworks for answering insurance- and financial-domain non-factoid questions; performance, however, remains lower than that of factoid QAS such as IBM Watson. An answer re-ranker is developed that explores a set of features based on the similarity between a question and its answer candidates, weighted by feature-importance scores. The method achieves a Mean Reciprocal Rank (MRR) of 0.64, which significantly improves over previous work on why-type answer re-ranking.
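
For reference, the MRR metric reported above averages, over all questions, the reciprocal of the rank at which the first correct answer appears in the re-ranked list. The sketch below shows this computation on hypothetical data; it is an illustration, not the authors' evaluation code.

```python
# Illustrative computation of Mean Reciprocal Rank (MRR): for each question, take
# the reciprocal of the rank of the first correct answer in the re-ranked list,
# then average over questions. The example data is hypothetical.
def mean_reciprocal_rank(ranked_relevance):
    """ranked_relevance: one 0/1 list per question, in ranked (best-first) order."""
    total = 0.0
    for labels in ranked_relevance:
        reciprocal = 0.0
        for rank, is_correct in enumerate(labels, start=1):
            if is_correct:
                reciprocal = 1.0 / rank
                break
        total += reciprocal
    return total / len(ranked_relevance)

# Correct answer ranked 1st, 2nd and 4th for three hypothetical questions:
print(mean_reciprocal_rank([[1, 0, 0], [0, 1, 0], [0, 0, 0, 1]]))  # ~0.583
```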
