Abstract

Question-answering systems make good use of knowledge bases (KBs, e.g., Wikipedia) for responding to definition queries. Typically, systems extract facts relevant to the question from articles across KBs and then project them onto the candidate answers. However, studies have shown that the performance of this kind of method drops sharply whenever KBs supply narrow coverage. This work describes a new approach to this problem: constructing context models for scoring candidate answers, which are, more precisely, statistical n-gram language models inferred from lexicalized dependency paths extracted from Wikipedia abstracts. Unlike state-of-the-art approaches, context models are created by capturing the semantics of candidate answers (e.g., "novel," "singer," "coach," and "city"). This work is extended by investigating the impact on context models of extra linguistic knowledge such as part-of-speech tagging and named-entity recognition. Results showed the effectiveness of context models as n-gram lexicalized dependency paths and as promising context indicators for the presence of definitions in natural language texts.
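
To make the core idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of a context model: a smoothed bigram language model estimated over lexicalized dependency paths, which can then score how well a candidate answer's paths fit a definitional context. The example paths, edge labels, and helper names are illustrative assumptions.

```python
from collections import defaultdict

def train_bigram_model(paths):
    """Estimate bigram counts with a shared vocabulary over tokenized
    lexicalized dependency paths (lists of node/edge labels)."""
    bigram_counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for path in paths:
        tokens = ["<s>"] + path + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            bigram_counts[prev][cur] += 1
    return bigram_counts, vocab

def score(path, bigram_counts, vocab):
    """Return the add-one-smoothed probability of a path under the model."""
    tokens = ["<s>"] + path + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        total = sum(bigram_counts[prev].values())
        prob *= (bigram_counts[prev][cur] + 1) / (total + len(vocab))
    return prob

# Hypothetical lexicalized dependency paths drawn from definitional
# sentences in Wikipedia abstracts for the answer sense "city":
city_paths = [
    ["city", "nsubj<-", "is", "->attr", "capital"],
    ["city", "nsubj<-", "is", "->attr", "municipality"],
]
model, vocab = train_bigram_model(city_paths)

# A candidate answer whose paths score higher fits the "city" context better.
candidate = ["city", "nsubj<-", "is", "->attr", "capital"]
print(score(candidate, model, vocab))
```

In practice, a dependency parser would produce the paths and higher-order n-grams with more robust smoothing would replace the add-one bigram model shown here; the sketch only illustrates how path likelihood can rank candidate answers.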
