Abstract
Relation classification is crucial for inferring semantic relatedness between entities in a piece of text. Such systems can be trained from labelled data, but relation classification is highly domain-specific, and labelling data for a new domain requires substantial effort. In this paper, we explore domain adaptation techniques for this task. While past work has focused on single-source domain adaptation for biomedical relation classification, we classify relations in an unlabeled target domain by transferring useful knowledge from one or more related source domains. Our model improves the state-of-the-art F1 score on 3 benchmark biomedical corpora in the single-source setting and on 2 out of 3 corpora in the multi-source setting. When used with contextualized embeddings, performance improves further, outperforming neural-network-based domain adaptation baselines in both cases.
Highlights
In the biomedical domain, a relation can exist between various entity types, e.g., protein-protein, drug-drug, and chemical-protein.
We explore multi-source single-target (MSST) adaptation to enrich the transferred knowledge, using additional smaller corpora for the protein-protein relation and multiple relation labels for the chemical-protein relation, respectively.
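To make the MSST setup concrete, the following is a minimal sketch (not taken from the paper) of how batches from several labelled source corpora and one unlabeled target corpus can be interleaved during training; the corpus names and toy examples are hypothetical placeholders.

```python
import itertools

def msst_batches(source_corpora, target_corpus, batch_size=2):
    """Yield (domain_id, batch, is_labeled) tuples, cycling over all
    labelled source corpora plus the unlabeled target corpus."""
    domains = list(source_corpora.items()) + [("target", target_corpus)]
    iters = {name: itertools.cycle(data) for name, data in domains}
    while True:
        for domain_id, (name, _) in enumerate(domains):
            batch = [next(iters[name]) for _ in range(batch_size)]
            yield domain_id, batch, name != "target"

# Hypothetical toy corpora: sentence fragments with entity markers;
# only the source domains carry relation labels.
sources = {
    "protein-protein": [("PROT1 binds PROT2", 1), ("PROT1 near PROT2", 0)],
    "drug-drug":       [("DRUG1 inhibits DRUG2", 1), ("DRUG1 and DRUG2", 0)],
}
target = ["CHEM1 activates PROT1", "CHEM1 with PROT1"]  # unlabeled chemical-protein text

gen = msst_batches(sources, target)
for _ in range(3):  # one pass over the three domains
    print(next(gen))
```

The relation classifier sees labels only from the source batches, while every batch (including the target) can feed the domain discriminator.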
Summary
A relation can exist between various entity types, e.g., protein-protein, drug-drug, and chemical-protein. Given an unlabeled target domain, we transfer common useful features from related labelled source domains using adversarial training (Goodfellow et al., 2014). Adversarial training helps overcome sampling bias and, through min-max optimization, learns common, domain-indistinguishable features that promote generalization. Our main contributions are: 1) We adopt the Multinomial Adversarial Network integrated with the Shared-Private model (Chen and Cardie, 2018), originally proposed for multi-domain text classification; unlike traditional binomial adversarial networks, it can handle multiple source domains at a time. 2) We explore the generalizability of our framework using two prominent neural architectures, CNN (Nguyen and Grishman, 2015) and BiLSTM (Kavuluru et al., 2017), and find the former to be more robust across our experiments.
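As a rough illustration of the shared-private adversarial setup described above, the PyTorch sketch below shows one alternating min-max step: a shared CNN encoder and per-domain private CNN encoders feed a relation classifier, while a multinomial domain discriminator is trained on the shared features and the encoders are trained to confuse it. The layer sizes, the toy batch, and the exact confusion term (here the negative discriminator cross-entropy) are our own assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class CNNEncoder(nn.Module):
    """Simple 1-D CNN over token embeddings, used for both the shared and
    the per-domain private feature extractors."""
    def __init__(self, emb_dim=100, hidden=64):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, hidden, kernel_size=3, padding=1)

    def forward(self, x):                # x: (batch, seq_len, emb_dim)
        h = torch.relu(self.conv(x.transpose(1, 2)))
        return h.max(dim=2).values       # (batch, hidden)

n_domains, n_relations, hidden = 3, 5, 64
shared = CNNEncoder()                                             # features common to all domains
private = nn.ModuleList(CNNEncoder() for _ in range(n_domains))   # one private encoder per domain
rel_clf = nn.Linear(2 * hidden, n_relations)                      # relation classifier (shared + private)
dom_clf = nn.Linear(hidden, n_domains)                            # multinomial domain discriminator

opt_main = torch.optim.Adam([*shared.parameters(), *private.parameters(),
                             *rel_clf.parameters()], lr=1e-3)
opt_dom = torch.optim.Adam(dom_clf.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

x = torch.randn(8, 20, 100)                       # toy batch of token embeddings
rel_y = torch.randint(0, n_relations, (8,))       # relation labels (available for source domains only)
dom_y = torch.zeros(8, dtype=torch.long)          # this batch comes from domain 0

# Max step: train the discriminator to identify the domain from shared features.
dom_loss = ce(dom_clf(shared(x).detach()), dom_y)
opt_dom.zero_grad(); dom_loss.backward(); opt_dom.step()

# Min step: train encoders + relation classifier while confusing the
# discriminator (negating its loss), so shared features become domain-invariant.
feats = torch.cat([shared(x), private[0](x)], dim=1)
loss = ce(rel_clf(feats), rel_y) - ce(dom_clf(shared(x)), dom_y)
opt_main.zero_grad(); loss.backward(); opt_main.step()
```

In practice these two steps alternate over batches drawn from all source domains and the unlabeled target, with the relation loss applied only to labelled source batches.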