Abstract

Text classification is one of the fundamental tasks in natural language processing (NLP); it has been studied for decades and various approaches have been proposed. Unfortunately, text classification is highly domain-dependent: a subtle shift between the training and testing data distributions can cause catastrophic performance deterioration. Moreover, the availability of massive labeled data varies across domains in real-world applications. It is therefore important to investigate how to improve classification accuracy on a target domain by leveraging resources from related domains. Multi-domain text classification (MDTC) was proposed to address this problem. Mainstream MDTC approaches resort to transfer learning techniques to reduce the divergence across domains. In particular, these methods adopt adversarial training and the shared-private paradigm to implement domain alignment, yielding state-of-the-art performance. Adversarial learning reduces domain divergence through a minimax optimization that produces domain-invariant features, which are supposed to be both transferable and discriminative, while the shared-private paradigm employs domain-specific features to boost the discriminability of the domain-invariant features. In this thesis, we make several contributions to advance MDTC. First, we propose a dual adversarial co-learning method that utilizes two forms of adversarial training to refine the domain-invariant features. Second, we apply mixup to conduct category and domain regularizations, enriching the intrinsic features in the shared latent space and enforcing consistent predictions in-between training samples so that the learned features are more transferable and discriminative. Third, we analyze the limitations of adversarial alignment on marginal distributions and propose a novel conditional adversarial network that aligns the joint distributions of domain-invariant features and label predictions. Fourth, we propose a co-regularized adversarial learning framework that constructs two diverse adversarial training streams and aligns multiple conditional distributions by penalizing disagreements between the outputs of the two streams; we also incorporate entropy minimization and virtual adversarial training to avoid violating the cluster assumption. Finally, we incorporate the margin discrepancy to measure domain divergence in MDTC and bridge the gap between MDTC algorithms and theory by deriving a new generalization bound based on the margin discrepancy.
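
The mechanism underlying these contributions is adversarial domain alignment: a shared feature extractor is trained to fool a domain discriminator so that the shared features become domain-invariant. As a rough illustration only, the sketch below shows one common way to implement such a minimax optimization with a gradient reversal layer in PyTorch; the names (e.g. domain_discriminator, GradReverse) are hypothetical placeholders and not the exact implementation developed in the thesis.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GradReverse(torch.autograd.Function):
        # Identity on the forward pass; negates (and scales) the gradient on the
        # backward pass, so minimizing the discriminator loss with respect to the
        # shared feature extractor effectively maximizes it -- a minimax game
        # realized in a single backward pass.
        @staticmethod
        def forward(ctx, x, lam):
            ctx.lam = lam
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -ctx.lam * grad_output, None

    def domain_adversarial_loss(shared_features, domain_labels,
                                domain_discriminator, lam=1.0):
        # The discriminator tries to predict which domain each sample came from,
        # while the reversed gradient pushes the shared features toward
        # domain invariance.
        reversed_features = GradReverse.apply(shared_features, lam)
        domain_logits = domain_discriminator(reversed_features)
        return F.cross_entropy(domain_logits, domain_labels)

Added to the per-domain classification losses, this term yields a standard shared-private adversarial training objective, which the contributions summarized above refine in different ways.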
