Abstract

Implicit discourse relation recognition is a serious challenge in discourse analysis, which aims to understand and annotate the latent relations between two discourse arguments, such as temporal and comparison relations. Most neural network-based models encode linguistic features (such as syntactic parses and position information) as embedding vectors, which are prone to error propagation due to unsuitable pre-processing. Other methods apply attention or memory mechanisms that mainly consider the key points in the discourse yet ignore some valuable clues. In particular, models using convolutional neural networks retain local contexts but lose word order information due to the standard pooling operation, while methods using bidirectional long short-term memory networks consider the word sequence and retain global information but cannot capture contexts of different range sizes. In this paper, we propose a novel Dynamic Chunk-based Max Pooling BiLSTM-CNN framework (DC-BCNN) to address these issues. First, we exploit BiLSTMs to capture the semantic representations of discourse arguments. Second, we adopt the proposed convolutional layer to automatically extract "multi-granularity" features (similar to n-grams) by setting different convolution filter sizes. Then, we design a dynamic chunk-based max pooling strategy to obtain important scaled features from different parts of a discourse argument. This strategy dynamically divides each argument into several segments (called chunks) according to the argument length and the index of the current pooling layer in the CNN, and then selects the maximum value of each chunk to indicate crucial information. Finally, we utilize a fully connected layer with a softmax function to recognize discourse relations. Experimental results on two corpora (i.e., PDTB and HIT-CDTB) show that our proposed model is effective for implicit discourse relation recognition.
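The chunk-based pooling step above can be sketched in a few lines. This is a minimal illustration, not the authors' exact implementation: the paper varies the number of chunks dynamically with the argument length and the pooling-layer index, whereas here the chunk count is a fixed parameter for clarity.

```python
def chunk_max_pool(features, num_chunks):
    """Split a per-position feature sequence into `num_chunks` contiguous
    chunks and keep the maximum of each chunk, so that (unlike a single
    global max pool) coarse positional information survives pooling."""
    n = len(features)
    if n == 0:
        return []
    k = min(num_chunks, n)                       # at most one chunk per position
    bounds = [i * n // k for i in range(k + 1)]  # near-equal chunk boundaries
    return [max(features[bounds[i]:bounds[i + 1]]) for i in range(k)]
```

With `num_chunks=1` this degenerates to standard global max pooling; larger chunk counts trade pooling strength for finer positional resolution.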

Highlights

  • Discourse relation describes how two adjacent text units are connected logically, which can capture essential structural and semantic aspects of a discourse;

  • We propose a novel Dynamic Chunk-based Max Pooling bidirectional LSTM (BiLSTM)-convolutional neural network (CNN) framework (DC-BCNN) for implicit discourse relation recognition, which can automatically induce semantic understanding from wider ranges of n-grams and retain more valid information without complicated natural language processing (NLP) pre-processing;

  • In ablation experiments on the Penn Discourse TreeBank (PDTB), we find that neural network-based methods achieve a higher F1 score than the traditional support vector machine (SVM) model.


Summary

INTRODUCTION

Discourse relation describes how two adjacent text units (e.g., clauses, sentences, and larger sentence groups) are connected logically, which captures essential structural and semantic aspects of a discourse. Two questions arise: how can discourse arguments be modelled properly, and how can the interactions between arguments be captured? To address these issues, considerable research has been performed on implicit discourse relation recognition using traditional NLP linguistically informed features and machine learning algorithms [6]–[9]. Convolutional neural networks typically utilize a standard max pooling layer that applies a max operation over a feature map to capture the most useful information. This operation may lose valuable facts, such as word order information, which helps identify implicit discourse relations.
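The order-loss problem can be seen in a toy example (illustrative only, not from the paper): two arguments whose feature maps are position-reversals of each other collapse to the same vector under standard global max pooling, so the classifier cannot tell them apart.

```python
def global_max_pool(feature_map):
    """Standard max-over-time pooling: column-wise max over all positions."""
    return [max(col) for col in zip(*feature_map)]

# Two toy feature maps with opposite position order...
forward = [[0.9, 0.1], [0.2, 0.8]]
backward = [[0.2, 0.8], [0.9, 0.1]]

# ...pool to the identical vector, so word order is lost.
assert global_max_pool(forward) == global_max_pool(backward)
```

This is the limitation that chunk-based pooling is designed to mitigate: pooling within chunks keeps a per-region maximum, so reversed inputs generally produce different pooled vectors.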

THE PROPOSED APPROACH
DISCOURSE ARGUMENT REPRESENTATION
MODEL TRAINING
RESULTS AND DISCUSSION
RELATED WORK
CONCLUSION AND FUTURE WORK