Query-based summarization of discussion threads

Suzan Verberne, Sander Wubben, Emiel Krahmer, Antal Van Den Bosch

doi:10.1017/s1351324919000123

Abstract

In this paper, we address query-based summarization of discussion threads. New users can profit from the information shared in the forum if they can find the previously posted information. However, discussion threads on a single topic can easily comprise dozens or hundreds of individual posts. Our aim is to summarize forum threads given real web search queries. We created a data set with search queries from a discussion forum's search engine log and the discussion threads that were clicked by the user who entered the query. For 120 thread–query combinations, a reference summary was made by five different human raters. We compared two methods for automatic summarization of the threads: a query-independent method based on post features, and Maximum Marginal Relevance (MMR), a method that takes the query into account. We also compared four different word embedding representations as alternatives to standard word vectors in extractive summarization. We find (1) that the agreement between human summarizers does not improve when a query is provided; (2) that the query-independent post features as well as a centroid-based baseline outperform MMR by a large margin; (3) that combining the post features with query similarity gives a small improvement over the use of post features alone; and (4) that for the word embeddings, a match in domain appears to be more important than corpus size and dimensionality. However, the differences between the models were not reflected by differences in quality of the summaries created with the help of these models. We conclude that query-based summarization with web queries is challenging because the queries are short, and a click on a result is not a direct indicator of the relevance of the result.
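To make the query-dependent method concrete, the following is a minimal sketch of the standard MMR selection rule: each candidate post is scored by its similarity to the query minus its maximum similarity to posts already selected. This is an illustration under stated assumptions, not the authors' implementation: the bag-of-words cosine similarity, the whitespace tokenization, the weight `lam`, the function names, and the toy posts are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words term counts (assumed whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(posts, query, k=2, lam=0.7):
    """Greedily pick k post indices, trading query relevance against redundancy.

    lam is an assumed trade-off weight: 1.0 means pure query relevance,
    lower values penalize similarity to already selected posts more strongly.
    """
    vecs = [bow(p) for p in posts]
    q = bow(query)
    selected = []
    candidates = list(range(len(posts)))

    def score(i):
        relevance = cosine(vecs[i], q)
        redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy thread (invented for illustration; not data from the paper).
posts = [
    "the new medication gave me headaches",
    "the new medication gave me bad headaches",
    "my doctor suggested a lower dose of the medication",
    "thanks everyone for the support",
]
summary = mmr_select(posts, "medication headaches", k=2, lam=0.5)
```

With `lam=0.5` on this toy thread, the near-duplicate second post is skipped in favor of the post about dosage, while `lam=1.0` reduces the method to ranking by query similarity alone. A short query contributes only a few terms to the relevance score, which is one way to see why short web queries make this setting difficult.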

Highlights

User-generated content in online forum communities is a valuable source of information
Automatic summarization can pivot information finding in long threads by reducing a thread to only the most important information, which can be helpful for patient communities, and for many other kinds of discussion forums
We evaluate Maximum Marginal Relevance (MMR) for query-dependent extractive summarization of discussion threads using short user queries, and compare it to a common query-independent method based on generic post features such as length, position, and centrality

Summary

Introduction

User-generated content in online forum communities is a valuable source of information. It has been shown that patients are better informed if they participate in online patient communities (van Uden-Kraan et al. 2009). This is true for patients who post messages themselves and for "lurkers" (i.e., forum users who do not post but only read) (van Uden-Kraan et al. 2008). Discussion threads on a single topic can comprise dozens or hundreds of individual posts, which makes it difficult to find the relevant information in the thread (Bhatia and Mitra 2010). This has motivated the development of text mining methods for disclosing the information in forum communities, combining free text search with information extraction and summarization (van Oortmerssen et al. 2017). Automatic summarization can pivot information finding in long threads by reducing a thread to only the most important information, which can be helpful for patient communities, and for many other kinds of discussion forums.

Journal: Natural Language Engineering	Publication Date: Apr 16, 2019
Citations: 9	License type: CC BY 4.0

Similar Papers

Diversity driven attention model for query-based abstractive summarization
Preksha Nema ... Mitesh M Khapra
01 Jan 2017

Cross-Task Knowledge Transfer for Query-Based Text Summarization
Elozino Egonmwan ... Vittorio Castelli
01 Jan 2019

Unsupervised Broadcast News Summarization; a Comparative Study on Maximal Marginal Relevance (MMR) and Latent Semantic Analysis (LSA)
Majid Ramezani ... Mohammad-Salar Shahryari
25 Jan 2023

Query-based summarization for Indonesian news articles
Dininta Annisa ... Masayu Leylia Khodra
01 Aug 2017
