The input to the multi-document summarization task is usually long and highly redundant, and encoding multiple documents is a challenge for the Seq2Seq architecture. Simply concatenating the documents into a single sequence ignores the relations between them. Attention-based Seq2Seq architectures have slightly improved cross-document relation modeling for multi-document summarization; however, they ignore the relations between sentences, and the attention mechanism alone yields little further improvement. This paper proposes a hierarchical approach that leverages the relations among words, sentences, and documents for abstractive multi-document summarization. Our model employs Graph Convolutional Networks (GCN) to capture cross-document and cross-sentence relations; the GCN module enriches semantic representations by generating high-level hidden features. Our model achieves significant improvements over the attention-based baseline, beating the Hierarchical Transformer by 3.4/1.64 and 1.92/1.44 ROUGE-1/2 F1 points on the Multi-News and WikiSum datasets, respectively. Experimental results demonstrate that our method brings substantial improvements over several strong baselines.
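To illustrate the kind of graph convolution the abstract refers to, the following is a minimal NumPy sketch of one standard GCN layer (H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W)) applied to sentence/document nodes. The node graph, feature sizes, and function name are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One standard graph-convolution layer (illustrative, not the paper's exact model).
    A: (n, n) adjacency over sentence/document nodes,
    H: (n, d_in) node features, W: (d_in, d_out) learned weights."""
    A_hat = A + np.eye(A.shape[0])                        # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt              # symmetric normalization
    return np.maximum(0.0, A_norm @ H @ W)                # ReLU activation

# Toy example: 3 sentence nodes connected in a chain.
rng = np.random.default_rng(0)
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = rng.standard_normal((3, 4))   # initial sentence representations
W = rng.standard_normal((4, 2))   # layer weights
H1 = gcn_layer(A, H, W)
print(H1.shape)  # (3, 2): each node now mixes information from its neighbors
```

Stacking such layers lets information propagate across sentences and documents, producing the "high-level hidden features" the abstract describes.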