QuAX

Muhammad Shihab Rashid, Vagelis Hristidis, Fuad Jamour

doi:10.1145/3459637.3482289

Abstract

Frequently Asked Questions (FAQs) are a form of semi-structured data that provides users with commonly requested information and enables several natural language processing tasks. Given the plethora of such question-answer pairs on the Web, there is an opportunity to automatically build large FAQ collections for any domain, such as COVID-19 or plastic surgery. These collections can be used by information-seeking portals and applications, such as AI chatbots. Automatically identifying and extracting such high-utility question-answer pairs is challenging and has received little research attention. For a question-answer pair to be useful to a broad audience, it must (i) provide general information, that is, not be specific to the Web site or Web page where it is hosted, and (ii) be self-contained, that is, contain no references to other entities on the page and no missing terms (ellipses) that render the question-answer pair ambiguous. Although identifying general, self-contained questions may seem like a straightforward binary classification problem, the limited availability of training data for this task and the countless domains make building machine learning models challenging. Existing efforts in extracting FAQs from the Web typically focus on FAQ retrieval without much regard to the utility of the extracted FAQs. We propose QuAX: a framework for extracting high-utility (i.e., general and self-contained) domain-specific FAQ lists from the Web. QuAX receives a set of keywords from a user and works in a pipelined fashion to find relevant Web pages and extract general and self-contained question-answer pairs. We experimentally show how QuAX generates high-utility FAQ collections with little, domain-agnostic training data, and how the individual stages of the pipeline improve on the corresponding state of the art.
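To make the two utility criteria concrete, the following is a minimal sketch of a filter that keeps only general and self-contained question-answer pairs. It is not the QuAX implementation: the abstract describes trained classifiers, whereas this sketch substitutes simple keyword rules, and every name in it (`QAPair`, `extract_pairs`, `is_general`, `is_self_contained`, `high_utility`, and the cue patterns) is a hypothetical chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str
    answer: str


# Hypothetical cue patterns (assumptions, not taken from the paper).
SITE_SPECIFIC = re.compile(r"\b(our|we|my account|this site|this website)\b", re.IGNORECASE)
DANGLING_REFERENCE = re.compile(r"\b(above|below|click here|this page)\b", re.IGNORECASE)
ELLIPTIC_START = re.compile(r"^(and|or|but|what about|how about|it|they)\b", re.IGNORECASE)


def extract_pairs(lines):
    """Toy extraction: a line ending in '?' followed by a non-question line."""
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    pairs = []
    for q, a in zip(cleaned, cleaned[1:]):
        if q.endswith("?") and not a.endswith("?"):
            pairs.append(QAPair(q, a))
    return pairs


def is_general(pair):
    """Criterion (i): no wording that ties the pair to its host site."""
    text = f"{pair.question} {pair.answer}"
    return SITE_SPECIFIC.search(text) is None


def is_self_contained(pair):
    """Criterion (ii): no dangling page references and no elliptic question."""
    text = f"{pair.question} {pair.answer}"
    if DANGLING_REFERENCE.search(text):
        return False
    return ELLIPTIC_START.match(pair.question.strip()) is None


def high_utility(pairs):
    """Keep pairs that satisfy both criteria."""
    return [p for p in pairs if is_general(p) and is_self_contained(p)]


if __name__ == "__main__":
    page = [
        "What are common symptoms of COVID-19?",
        "Fever, cough, and fatigue are commonly reported symptoms.",
        "How do I reset my account password?",
        "Click here to reset it.",
        "What about children?",
        "They usually show milder symptoms.",
    ]
    for pair in high_utility(extract_pairs(page)):
        print(pair.question)
```

In this toy input, the second pair is rejected as site-specific and the third as elliptic, so only the first pair survives. Hand-written rules like these would not transfer across domains, which is the difficulty the abstract attributes to the task.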

Similar Papers

FAQtory: A framework to provide high-quality FAQ retrieval systems
A Moreo ... J.M Zurita
Expert Systems With Applications | VOL. 39
24 Feb 2012

SMSFR: SMS-Based FAQ Retrieval System
Partha Pakray ... Santanu Pal
01 Jan 2013

Paraphrase-focused learning to rank for domain-specific frequently asked questions retrieval
Mladen Karan ... Jan Šnajder
Expert Systems With Applications | VOL. 91
12 Sep 2017

Using rough set theory to construct e-learning FAQ retrieval infrastructure
Deng-Yiv Chiu ... Wen-Chih Chang
01 Jul 2008
