Abstract

Sentence compression is a natural language processing (NLP) task that produces a short paraphrase of an input sentence by deleting words from it while ensuring grammatical correctness and preserving the core information. This study introduces a graph convolutional network (GCN) into the sentence compression task to encode syntactic information, such as dependency trees. Because we extend the GCN to handle directed edges, a compression model with GCN layers can distinguish between parent and child nodes in a dependency tree when aggregating adjacent nodes. Furthermore, by increasing the number of GCN layers, the model can gradually collect higher-order information of a dependency tree as node information propagates through the layers. We implement sentence compression models for both Korean and English. Each model consists of three components: a pre-trained BERT model, GCN layers, and a scoring layer. The scoring layer determines whether a word should remain in the compressed sentence based on its word vector, which contains the contextual and syntactic information encoded by the BERT and GCN layers. To train and evaluate the proposed model, we used the Google sentence compression dataset for English and, for Korean, a sentence compression corpus containing about 140,000 sentence pairs. The experimental results demonstrate that the proposed model achieves state-of-the-art performance for English. To the best of our knowledge, this is the first Korean sentence compression model based on a deep learning model trained with a large-scale corpus.
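
The architecture described above can be sketched as follows. This is a minimal, hypothetical PyTorch-style illustration, not the authors' implementation: the class names, layer widths, number of GCN layers, and the assumption that subword tokens are already aligned to dependency-tree words are all ours.

```python
import torch
import torch.nn as nn
from transformers import BertModel


class DirectedGCNLayer(nn.Module):
    """One GCN layer that treats dependency edges as directed,
    aggregating parent (head) and child (dependent) nodes with
    separate weight matrices."""

    def __init__(self, dim):
        super().__init__()
        self.w_self = nn.Linear(dim, dim)    # node's own representation
        self.w_parent = nn.Linear(dim, dim)  # messages from the head word
        self.w_child = nn.Linear(dim, dim)   # messages from dependent words

    def forward(self, h, adj):
        # h:   (batch, seq_len, dim) node representations
        # adj: (batch, seq_len, seq_len); adj[b, i, j] = 1.0 iff
        #      token j is the dependency head (parent) of token i
        parent_msg = torch.bmm(adj, self.w_parent(h))                # from parents
        child_msg = torch.bmm(adj.transpose(1, 2), self.w_child(h))  # from children
        return torch.relu(self.w_self(h) + parent_msg + child_msg)


class CompressionModel(nn.Module):
    def __init__(self, num_gcn_layers=2, dim=768):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        # Stacking k layers lets each node see up to k hops of the tree.
        self.gcn_layers = nn.ModuleList(
            [DirectedGCNLayer(dim) for _ in range(num_gcn_layers)]
        )
        self.score = nn.Linear(dim, 1)  # per-token keep/delete score

    def forward(self, input_ids, attention_mask, adj):
        h = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        for layer in self.gcn_layers:
            h = layer(h, adj)
        return torch.sigmoid(self.score(h)).squeeze(-1)  # P(keep) per token
```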

Highlights

  • Sentence compression is a natural language processing (NLP) task where the primary objective is to generate a short paraphrase of an input sentence [1]

  • The compression ratio (CR) is the average number of characters in a compressed sentence divided by that of the original sentence, and MC is the CR of sentences compressed by a model minus the CR of the training set (see the sketch after this list)

  • This study introduced a graph convolutional network (GCN) into a sentence compression task to represent a word with its dependency tree
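
The following is a small sketch of how the two metrics above can be computed, under one plausible reading of the definition (CR as a ratio of average character counts); the example sentences and the training-set CR value are invented for illustration.

```python
def compression_ratio(originals, compressions):
    """CR: average character count of the compressed sentences
    divided by that of the original sentences."""
    avg_orig = sum(len(s) for s in originals) / len(originals)
    avg_comp = sum(len(s) for s in compressions) / len(compressions)
    return avg_comp / avg_orig


originals = ["The quick brown fox jumped over the lazy dog ."]
compressed = ["The fox jumped over the dog ."]

cr_model = compression_ratio(originals, compressed)
cr_train = 0.43            # CR of the training set (illustrative value)
mc = cr_model - cr_train   # MC near 0 means the model matches the gold ratio
print(f"CR = {cr_model:.2f}, MC = {mc:+.2f}")
```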


Summary

Introduction

Sentence compression is a natural language processing (NLP) task where the primary objective is to generate a short paraphrase of an input sentence [1]. Deletion-based compression is the most common approach to sentence compression: it casts the task as a sequence-labeling problem that determines whether each word of a source sentence should be retained or deleted [2]. Only the retained words are then used to compose the compressed sentence (a minimal sketch follows below). Syntactic information, such as a syntactic tree, has been adopted as a feature in many sentence compression systems to preserve the grammaticality of a compressed sentence. The graph neural network (GNN) has recently gained popularity in dependency parsing research because it can appropriately represent a node of a dependency tree by aggregating its parent and child nodes, and can gradually incorporate higher-order information of the tree by collecting neighboring nodes.
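
To make the deletion-based formulation concrete, here is a minimal sketch; the sentence and keep/delete labels are invented for illustration.

```python
# Deletion-based compression as sequence labeling: each token receives a
# binary keep/delete label, and the retained tokens form the compression.
tokens = ["The", "company", ",", "founded", "in", "1998", ",",
          "reported", "record", "profits", "."]
labels = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1]  # 1 = keep, 0 = delete

compressed = " ".join(t for t, keep in zip(tokens, labels) if keep)
print(compressed)  # -> "The company reported record profits ."
```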

