Improving FAQ Retrieval Using Query Log Clustering in Latent Semantic Space

Harksoo Kim, Jungyun Seo, Hyunjung Lee

doi:10.1007/11562382_18

Abstract

Lexical disagreement problems often occur in FAQ retrieval because FAQs, unlike general documents, consist of just one or two sentences. To resolve these problems, we propose a high-performance FAQ retrieval system using query log clustering. At indexing time, the proposed system uses latent semantic analysis techniques to classify and group the logs of users’ queries into predefined FAQ categories. At retrieval time, it uses the query log clusters as a form of FAQ smoothing. In our experiment, we found that the proposed system could resolve some lexical disagreement problems between queries and FAQs.

Keywords: Cosine Similarity, Latent Semantic Analysis, Vector Space Model, Content Word, Term Weight
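The two stages described in the abstract (grouping logged queries under FAQs at indexing time, then using those groups to smooth FAQs at retrieval time) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes plain term-frequency vectors and cosine similarity in place of the latent semantic space, a whitespace tokenizer, and a linear interpolation weight `lam`; the function names, the toy FAQs, and the toy query logs are all hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (assumed tokenizer: lowercase + whitespace split)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_logs(faqs, logs):
    """Indexing time: attach each logged query to its most similar FAQ.

    Assumption: similarity is computed in raw term space here, whereas the
    abstract describes doing this step in a latent semantic space.
    """
    clusters = {faq_id: Counter() for faq_id in faqs}
    for log in logs:
        vec = vectorize(log)
        best = max(faqs, key=lambda f: cosine(vec, vectorize(faqs[f])))
        if cosine(vec, vectorize(faqs[best])) > 0:
            clusters[best].update(vec)  # accumulate the log's terms in the cluster
    return clusters


def score(query, faq_text, cluster, lam=0.5):
    """Retrieval time: interpolate FAQ similarity with query-log-cluster similarity.

    Assumption: linear interpolation with weight lam is one possible smoothing form.
    """
    q = vectorize(query)
    return lam * cosine(q, vectorize(faq_text)) + (1 - lam) * cosine(q, cluster)


# Hypothetical toy data.
faqs = {
    "faq1": "how do i reset my password",
    "faq2": "how do i track my order",
}
logs = [
    "forgot password cannot login",
    "where is my order shipment",
]

clusters = cluster_logs(faqs, logs)
query = "cannot login"  # shares no word with either FAQ
ranked = sorted(faqs, key=lambda f: score(query, faqs[f], clusters[f]), reverse=True)
print(ranked)
```

In this toy setup the query "cannot login" has no term in common with either FAQ, so FAQ text alone cannot rank them. The logged query "forgot password cannot login" was attached to the password FAQ, and its terms let the smoothed score prefer that FAQ, which is the kind of lexical disagreement the abstract refers to.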
