Abstract
Considering the expensive annotation in Named Entity Recognition (NER ), Cross-domain NER enables NER in low-resource target domains with few or without labeled data, by transferring the knowledge of high-resource domains. However, the discrepancy between different domains causes the domain shift problem and hampers the performance of cross-domain NER in low-resource scenarios. In this article, we first propose an adversarial adaptive augmentation, where we integrate the adversarial strategy into a multi-task learner to augment and qualify domain adaptive data. We extract domain-invariant features of the adaptive data to bridge the cross-domain gap and alleviate the label-sparsity problem simultaneously. Therefore, another important component in this article is the progressive domain-invariant feature distillation framework. A multi-grained MMD (Maximum Mean Discrepancy) approach in the framework to extract the multi-level domain invariant features and enable knowledge transfer across domains through the adversarial adaptive data. Advanced Knowledge Distillation (KD) schema processes progressively domain adaptation through the powerful pre-trained language models and multi-level domain invariant features. Extensive comparative experiments over four English and two Chinese benchmarks show the importance of adversarial augmentation and effective adaptation from high-resource domains to low-resource target domains. Comparison with two vanilla and four latest baselines indicates the state-of-the-art performance and superiority confronted with both zero-resource and minimal-resource scenarios.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: ACM Transactions on Asian and Low-Resource Language Information Processing
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.