Memorizing All for Implicit Discourse Relation Recognition
Implicit discourse relation recognition is a challenging task due to the absence of the informative clues that explicit connectives provide. An implicit discourse relation recognizer has to carefully model the semantics of sentence pairs and cope with severe data sparsity. In this article, we learn token embeddings that encode the dependency structure of a sentence in their representations and use them to initialize a strong baseline model. We then propose a novel memory component that tackles the data sparsity issue by allowing the model to master the entire training set, yielding further performance improvement. The memory mechanism memorizes information by pairing the representations and discourse relations of all training instances, thus alleviating the data-hungry nature of current implicit discourse relation recognizers. The proposed memory component can be attached to any suitable baseline to enhance its performance. Experiments show that our full model, which memorizes the entire training set, delivers excellent results on the PDTB and CDTB datasets, outperforming the baselines by a fair margin.
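The abstract describes a memory that stores one (representation, relation) pair per training instance and is consulted at prediction time. A minimal sketch of that idea, assuming a cosine-similarity k-NN read-out (the class name, `write`/`read` interface, and voting scheme are illustrative assumptions, not the paper's exact design):

```python
import numpy as np

class RelationMemory:
    """Hypothetical instance memory: one slot per training example,
    pairing a sentence-pair representation with its discourse relation."""

    def __init__(self):
        self.keys = []    # sentence-pair representations
        self.values = []  # discourse relation labels

    def write(self, reps, labels):
        # Memorize the entire training set: one slot per instance.
        self.keys.extend(np.asarray(r, dtype=float) for r in reps)
        self.values.extend(labels)

    def read(self, query, k=3):
        # Return the majority relation among the k most similar slots
        # under cosine similarity (an assumed read-out, for illustration).
        K = np.stack(self.keys)
        q = np.asarray(query, dtype=float)
        sims = K @ q / (np.linalg.norm(K, axis=1) * np.linalg.norm(q) + 1e-9)
        top = np.argsort(-sims)[:k]
        votes = [self.values[i] for i in top]
        return max(set(votes), key=votes.count)
```

In a full model the memory read-out would be combined with the baseline classifier's prediction rather than used alone.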
- Research Article
16
- 10.1145/3028772
- Mar 17, 2017
- ACM Transactions on Asian and Low-Resource Language Information Processing
Discourse relations between two text segments play an important role in many Natural Language Processing (NLP) tasks. Connectives strongly indicate the sense of a discourse relation, yet a large proportion of discourse relations carry no connective at all; these are implicit discourse relations. Compared with explicit relations, implicit relations are much harder to detect and have drawn significant attention. To date, many studies have focused on English implicit discourse relations, while few address implicit relation recognition in Chinese, even though implicit discourse relations are more common in Chinese than in English. In our work, both English and Chinese are our focus. The key to implicit relation prediction is to properly model the semantics of the two discourse arguments, as well as the contextual interaction between them. To achieve this goal, we propose a neural network based framework that consists of two hierarchies. The first is the model hierarchy, in which we propose a max-margin learning method to explore the implicit discourse relation from multiple views. The second is the feature hierarchy, in which we learn multilevel distributed representations from words, arguments, and syntactic structures up to sentences. We have conducted experiments on the standard English and Chinese benchmarks, and the results show that, compared with several competing methods, our proposed method achieves the best performance in most cases.
- Conference Article
17
- 10.3115/v1/w14-4320
- Jan 1, 2014
In this paper we address the problem of skewed class distribution in implicit discourse relation recognition. We examine the performance of classifiers both for binary classification, predicting whether a particular relation holds or not, and for multi-class prediction. We review prior work to point out that the problem has been addressed differently for the binary and multi-class settings. We demonstrate that adopting a unified approach can significantly improve the performance of multi-class prediction. We also propose an approach that makes better use of the full annotations in the training set when downsampling is used. We report significant absolute improvements in performance for multi-class prediction, as well as significant improvement of binary classifiers for detecting the presence of implicit Temporal, Comparison, and Contingency relations.
- Research Article
91
- 10.1186/1471-2105-12-188
- May 23, 2011
- BMC Bioinformatics
Background: Identification of discourse relations, such as causal and contrastive relations, between situations mentioned in text is an important task for biomedical text-mining. A biomedical text corpus annotated with discourse relations would be very useful for developing and evaluating methods for biomedical discourse processing. However, little effort has been made to develop such an annotated resource. Results: We have developed the Biomedical Discourse Relation Bank (BioDRB), in which we have annotated explicit and implicit discourse relations in 24 open-access full-text biomedical articles from the GENIA corpus. Guidelines for the annotation were adapted from the Penn Discourse TreeBank (PDTB), which has discourse relations annotated over open-domain news articles. We introduced new conventions and modifications to the sense classification. We report reliable inter-annotator agreement of over 80% for all sub-tasks. Experiments for identifying the sense of explicit discourse connectives show the connective itself as a highly reliable indicator for coarse sense classification (accuracy 90.9% and F1 score 0.89). These results are comparable to results obtained with the same classifier on the PDTB data. With more refined sense classification, there is degradation in performance (accuracy 69.2% and F1 score 0.28), mainly due to sparsity in the data. The size of the corpus was found to be sufficient for identifying the sense of explicit connectives, with classifier performance stabilizing at about 1900 training instances. Finally, the classifier performs poorly when trained on PDTB and tested on BioDRB (accuracy 54.5% and F1 score 0.57). Conclusion: Our work shows that discourse relations can be reliably annotated in biomedical text. Coarse sense disambiguation of explicit connectives can be done with high reliability by using just the connective as a feature, but more refined sense classification requires either richer features or more annotated data.
The poor performance of a classifier trained in the open domain and tested in the biomedical domain suggests significant differences in the semantic usage of connectives across these domains, and provides robust evidence for a biomedical sublanguage for discourse and the need to develop a specialized biomedical discourse annotated corpus. The results of our cross-domain experiments are consistent with related work on identifying connectives in BioDRB.
- Research Article
16
- 10.1109/access.2019.2954988
- Jan 1, 2019
- IEEE Access
Implicit discourse relation recognition is a serious challenge in discourse analysis, which aims to understand and annotate the latent relations between two discourse arguments, such as temporal and comparison. Most neural network-based models encode linguistic features (such as syntactic parses and position information) as embedding vectors, which are prone to error propagation due to unsuitable pre-processing. Other methods apply attention or memory mechanisms that mainly consider the key points in the discourse yet ignore some valuable clues. In particular, those using convolutional neural networks retain local contexts but lose word order information due to the standard pooling operation. Methods using a bidirectional long short-term memory network consider the word sequence and retain global information, but cannot capture contexts of different range sizes. In this paper, we propose a novel Dynamic Chunk-based max pooling BiLSTM-CNN framework (DC-BCNN) to address these issues. First, we exploit BiLSTMs to capture the semantic representations of discourse arguments. Second, we adopt the proposed convolutional layer to automatically extract "multi-granularity" features (akin to n-grams) by setting different convolution filter sizes. Then, we design a dynamic chunk-based max pooling strategy to obtain the important scaled features of different parts of one discourse argument. This strategy dynamically divides each argument into several segments (called chunks) according to the argument length and the index of the current pooling layer in the CNN, and then selects the maximum value of each chunk to indicate crucial information. We further utilize a fully connected layer with a softmax function to recognize discourse relations. The experimental results on two corpora (i.e., PDTB and HIT-CDTB) show that our proposed model is effective for implicit discourse relation recognition.
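The chunk-based max pooling step can be sketched compactly: split the time axis of a feature map into contiguous segments and keep the per-dimension maximum of each, so coarse word order survives where a single global max pool would discard it. This is a minimal sketch of the general strategy; the exact chunking rule in DC-BCNN (which also depends on the pooling layer's index) is not reproduced here.

```python
import numpy as np

def chunk_max_pool(features, num_chunks):
    """Split a (length, dim) feature map into num_chunks contiguous
    segments and max-pool each segment separately. Assumes
    length >= num_chunks so every chunk is non-empty."""
    feats = np.asarray(features, dtype=float)
    # Evenly spaced chunk boundaries over the sequence length.
    bounds = np.linspace(0, len(feats), num_chunks + 1).round().astype(int)
    pooled = [feats[s:e].max(axis=0) for s, e in zip(bounds[:-1], bounds[1:])]
    return np.stack(pooled)  # shape: (num_chunks, dim)
```

Concatenating the pooled chunks gives a fixed-size representation per argument regardless of its length, which is what lets a fully connected softmax layer follow.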
- Conference Article
36
- 10.18653/v1/p19-1411
- Jan 1, 2019
It has been shown that implicit connectives can be exploited to improve the performance of the models for implicit discourse relation recognition (IDRR). An important property of the implicit connectives is that they can be accurately mapped into the discourse relations conveying their functions. In this work, we explore this property in a multi-task learning framework for IDRR in which the relations and the connectives are simultaneously predicted, and the mapping is leveraged to transfer knowledge between the two prediction tasks via the embeddings of relations and connectives. We propose several techniques to enable such knowledge transfer that yield the state-of-the-art performance for IDRR on several settings of the benchmark dataset (i.e., the Penn Discourse Treebank dataset).
- Conference Article
38
- 10.18653/v1/d15-1264
- Jan 1, 2015
Many discourse relations are explicitly marked with discourse connectives, and these examples could potentially serve as a plentiful source of training data for recognizing implicit discourse relations. However, there are important linguistic differences between explicit and implicit discourse relations, which limit the accuracy of such an approach. We account for these differences by applying techniques from domain adaptation, treating implicitly and explicitly-marked discourse relations as separate domains. The distribution of surface features varies across these two domains, so we apply a marginalized denoising autoencoder to induce a dense, domain-general representation. The label distribution is also domain-specific, so we apply a resampling technique that is similar to instance weighting. In combination with a set of automatically-labeled data, these improvements eliminate more than 80% of the transfer loss incurred by training an implicit discourse relation classifier on explicitly-marked discourse relations.
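The dense, domain-general representation mentioned above comes from a marginalized denoising autoencoder, which has a closed-form solution in the style of Chen et al.'s mDA: instead of sampling corrupted inputs, the expected feature corruption is marginalized out analytically. A sketch under that assumption (bias term and stacking omitted for brevity):

```python
import numpy as np

def marginalized_da(X, p=0.5, reg=1e-5):
    """Closed-form marginalized denoising autoencoder (mDA-style sketch).
    X is (n_samples, n_features); p is the probability that a feature is
    zeroed by corruption, marginalized out in expectation."""
    d = X.shape[1]
    S = X.T @ X                      # feature scatter matrix
    q = np.full(d, 1.0 - p)          # per-feature survival probability
    Q = S * np.outer(q, q)           # E[x_tilde x_tilde^T], off-diagonal
    np.fill_diagonal(Q, np.diag(S) * q)  # diagonal survives with prob q_i
    P = S * q                        # E[x x_tilde^T]
    # Reconstruction map W solves W Q = P (Q is symmetric; reg stabilizes).
    W = np.linalg.solve(Q + reg * np.eye(d), P.T).T
    return np.tanh(X @ W.T)          # nonlinearity yields dense features
```

In the domain-adaptation setting, X would stack explicitly and implicitly marked instances so the induced representation is shared across both domains.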
- Research Article
5
- 10.1007/s11063-017-9582-x
- Jan 23, 2017
- Neural Processing Letters
Implicit discourse relation recognition aims to discover the semantic relation between two sentences where the discourse connective is absent. Due to the lack of labeled data, previous work tries to generate additional training data automatically by removing discourse connectives from explicit discourse relation instances. However, using these artificial data indiscriminately has been shown to degrade the performance of implicit discourse relation recognition. To address this problem, we propose a co-training approach based on manual features and distributed features, which identifies useful instances from these artificial data to enlarge the labeled data. In addition, the distributed features are learned via recursive autoencoder based approaches, capable of capturing, to some extent, the sentence semantics that are valuable for implicit discourse relation recognition. Experimental results on both the PDTB and CDTB datasets indicate that: (1) the learned distributed features are complementary to the manual features and thus suitable for co-training; (2) our proposed co-training approach uses these artificial data effectively and significantly outperforms the baselines.
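The co-training loop itself is generic: two classifiers, each trained on one feature view, take turns labeling the unlabeled (artificial) pool, and confident predictions are promoted into the labeled set. A self-contained sketch with a toy per-class centroid classifier standing in for the paper's manual-feature and distributed-feature models (all names and the confidence threshold are illustrative assumptions):

```python
import numpy as np

def centroid_fit(X, y):
    # Toy classifier: softmax over negative distances to class centroids.
    labels = sorted(set(y))
    cents = {c: np.mean([x for x, t in zip(X, y) if t == c], axis=0)
             for c in labels}
    def predict_proba(x):
        d = np.array([np.linalg.norm(np.asarray(x) - cents[c]) for c in labels])
        p = np.exp(-d)
        return dict(zip(labels, p / p.sum()))
    return predict_proba

def co_train(Xa, Xb, y, Ua, Ub, rounds=3, thresh=0.6):
    """Co-training sketch: Xa/Xb are two views of the labeled data,
    Ua/Ub two views of the unlabeled pool. Each round, each view's
    classifier labels the pool; confident instances join the labeled set."""
    Xa, Xb, y = list(Xa), list(Xb), list(y)
    pool = list(range(len(Ua)))
    for _ in range(rounds):
        fa, fb = centroid_fit(Xa, y), centroid_fit(Xb, y)
        picked = []
        for i in pool:
            for f in (fa, fb):
                view = Ua[i] if f is fa else Ub[i]
                lab, p = max(f(view).items(), key=lambda kv: kv[1])
                if p >= thresh:  # promote confident instance with its label
                    Xa.append(Ua[i]); Xb.append(Ub[i]); y.append(lab)
                    picked.append(i)
                    break
        pool = [i for i in pool if i not in picked]
        if not picked:
            break
    return Xa, Xb, y
```

The key design point, as in classic co-training, is that the two views should be complementary, which is exactly the property the paper reports for manual versus distributed features.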
- Conference Article
15
- 10.18653/v1/2021.emnlp-main.187
- Jan 1, 2021
Implicit discourse relation recognition (IDRR) is a critical task in discourse analysis. Previous studies regard it only as a classification task and lack an in-depth understanding of the semantics of different relations. Therefore, we first view IDRR as a generation task and further propose a method that jointly models classification and generation. Specifically, we propose a joint model, CG-T5, to recognize the relation label and simultaneously generate a target sentence containing the meaning of the relation. Furthermore, we design three target sentence forms, including a question form, that let the generation model incorporate prior knowledge. To address the issue that large discourse units can hardly be embedded into the target sentence, we also propose a target sentence construction mechanism that automatically extracts core sentences from those large discourse units. Experimental results on both the Chinese MCDTB and English PDTB datasets show that our model CG-T5 achieves the best performance against several state-of-the-art systems.
- Research Article
5
- 10.1016/j.neucom.2017.02.084
- Mar 8, 2017
- Neurocomputing
Leveraging bilingually-constrained synthetic data via multi-task neural networks for implicit discourse relation recognition
- Video Transcripts
- 10.48448/qgy0-5t40
- Oct 15, 2021
- Underline Science Inc.
Implicit discourse relation recognition (IDRR) is a critical task in discourse analysis. Previous studies regard it only as a classification task and lack an in-depth understanding of the semantics of different relations. Therefore, we first view IDRR as a generation task and further propose a method that jointly models classification and generation. Specifically, we propose a joint model, CG-T5, to recognize the relation label and simultaneously generate a target sentence containing the meaning of the relation. Furthermore, we design three target sentence forms, including a question form, that let the generation model incorporate prior knowledge. To address the issue that large discourse units can hardly be embedded into the target sentence, we also propose a target sentence construction mechanism that automatically extracts core sentences from those large discourse units. Experimental results on both the Chinese MCDTB and English PDTB datasets show that our model CG-T5 achieves the best performance against several state-of-the-art systems.
- Research Article
6
- 10.1016/j.neucom.2019.08.081
- Aug 30, 2019
- Neurocomputing
Boosting implicit discourse relation recognition with connective-based word embeddings
- Research Article
19
- 10.3390/e25091294
- Sep 4, 2023
- Entropy
Implicit discourse relation recognition (IDRR) has long been considered a challenging problem in shallow discourse parsing. The absence of connectives makes such relations implicit and requires much more effort to understand the semantics of the text. Thus, it is important to preserve semantic completeness before any attempt to predict the discourse relation. However, word-level embedding, widely used in existing works, may lose semantics by splitting phrases that should be treated as complete semantic units. In this article, we propose three methods to segment a sentence into complete semantic units: a corpus-based method to serve as the baseline, a constituent parsing tree-based method, and a dependency parsing tree-based method that provides a more flexible and automatic way to divide the sentence. The segmented sentence is then embedded at the level of semantic units, so that the embeddings can be fed into the IDRR networks and play the same role as word embeddings. We implemented our methods in one of the recent IDRR models to compare performance with the original version using word-level embeddings. Results show that a proper embedding level better preserves the semantic information in the sentence and helps to enhance the performance of IDRR models.
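Once unit boundaries are available (from corpus statistics or a parse), embedding at the semantic-unit level reduces to composing the word vectors inside each span. A minimal sketch assuming averaging as the composition function (the paper's actual composition and boundary source may differ):

```python
import numpy as np

def unit_embeddings(tokens, units, word_vecs):
    """Embed a sentence at the level of semantic units.
    tokens:    list of words
    units:     list of (start, end) token spans covering the sentence,
               e.g. from a constituency or dependency parse
    word_vecs: dict mapping word -> vector
    Returns a (num_units, dim) array usable in place of word embeddings."""
    return np.stack([
        np.mean([word_vecs[tokens[i]] for i in range(s, e)], axis=0)
        for s, e in units
    ])
```

For example, treating "new york" as one unit yields a single vector for the phrase instead of two unrelated word vectors, which is the semantic completeness the abstract argues for.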
- Conference Article
1
- 10.1109/iceiec51955.2021.9463835
- Jun 18, 2021
Discourse relation recognition is an important branch of NLP that helps solve many NLP downstream tasks. In particular, implicit discourse relation recognition (IDRR) has been the focus of much research, yet existing work largely ignores the role of connectives in IDRR. In this paper, we propose two methods that use connectives to aid the IDRR task and achieve better results.
- Conference Article
13
- 10.18653/v1/2020.codi-1.14
- Jan 1, 2020
The PDTB-3 contains many more Implicit discourse relations than the previous PDTB-2. This is in part because implicit relations have now been annotated within sentences as well as between them. In addition, some now co-occur with explicit discourse relations, instead of standing on their own. Here we show that while this can complicate the problem of identifying the location of implicit discourse relations, it can in turn simplify the problem of identifying their senses. We present data to support this claim, as well as methods that can serve as a non-trivial baseline for future state-of-the-art recognizers for implicit discourse relations.
- Research Article
14
- 10.1016/j.jksuci.2014.06.001
- Oct 7, 2014
- Journal of King Saud University - Computer and Information Sciences
Learning explicit and implicit Arabic discourse relations