Abstract

Transfer learning aims to leverage valuable information in one domain to promote learning tasks in another domain. Recent studies have indicated that latent information, which is closely related to high-level concepts, is more suitable for cross-domain text classification than raw features. To obtain more of the latent information residing in the latent feature space, some previous methods construct multiple latent feature spaces. However, those methods ignore that the latent information of different latent spaces may lack the relevance needed to improve the adaptability of transfer learning models, and may even cause negative knowledge transfer when there is a glaring discrepancy among the latent spaces. Additionally, because those methods learn the latent space distributions with a direct-promotion strategy, their computational complexity increases exponentially as the number of latent spaces increases. To tackle these challenges, this paper proposes a Multiple Groups Transfer Learning (MGTL) method. MGTL first constructs multiple different latent feature spaces and then integrates adjacent ones with similar latent feature dimensions into one latent space group; in this way, multiple latent space groups are obtained. To enhance the relevance among these groups, MGTL ensures that adjacent groups share at least one latent space, so different groups are more relevant to each other than raw latent spaces are. Second, MGTL utilizes an indirect-promotion strategy to connect the different latent space groups; its computational complexity increases only linearly as the number of latent space groups increases, which is superior to multiple-latent-space methods based on direct promotion. In addition, an iterative algorithm is proposed to solve the optimization problem. Finally, a set of systematic experiments demonstrates that MGTL outperforms all the compared existing methods.
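To make the grouping step concrete, here is a minimal sketch of one way overlapping groups of adjacent latent spaces could be formed. The dimensions, window size, and function name are illustrative assumptions, not values taken from the paper.

    def make_latent_groups(dims, group_size=2):
        # Slide a window over the sorted latent dimensions so that every
        # pair of adjacent groups shares exactly one latent space.
        dims = sorted(dims)
        stride = group_size - 1
        return [tuple(dims[i:i + group_size])
                for i in range(0, len(dims) - group_size + 1, stride)]

    # Example: five hypothetical latent space dimensions, grouped pairwise.
    print(make_latent_groups([8, 16, 32, 64, 128]))
    # -> [(8, 16), (16, 32), (32, 64), (64, 128)]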

Highlights

  • Traditional classification algorithms achieve satisfying performance because they rely on the common assumption that both training and test data come from the same distribution

  • We propose Multiple Groups Transfer Learning (MGTL) based on non-negative matrix tri-factorization (NMTF) techniques, which groups multiple latent feature spaces and learns the corresponding distributions in the different latent space groups simultaneously (a minimal NMTF sketch follows this list)

  • We find that MGTL and MGTL-Direct outperform all the single latent space group approaches on all tasks, which shows that the MGTL strategy effectively improves classification performance
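
As a rough illustration of the NMTF machinery the second highlight refers to, the sketch below factorizes a non-negative document-word matrix X into F S G^T with the standard multiplicative updates for the Frobenius loss. It is a generic NMTF solver, not the authors' MGTL objective, and every name and setting here is an assumption.

    import numpy as np

    def nmtf(X, k_docs, k_words, n_iter=200, eps=1e-9, seed=0):
        # Generic tri-factorization X ~ F @ S @ G.T with non-negative
        # factors, fitted by standard multiplicative updates.
        rng = np.random.default_rng(seed)
        n, m = X.shape
        F = rng.random((n, k_docs))        # document (row) clusters
        S = rng.random((k_docs, k_words))  # cluster association matrix
        G = rng.random((m, k_words))       # word (column) clusters
        for _ in range(n_iter):
            F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
            S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
            G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        return F, S, G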


Summary

INTRODUCTION

Traditional classification algorithms achieve satisfying performance because they rely on the common assumption that both training and test data come from the same distribution. The key idea of MGTL is as follows. First, to obtain more latent information that can be used to learn the shared structure across domains, MGTL constructs multiple different latent feature spaces and integrates adjacent latent spaces with similar latent feature dimensions into one latent space group. To decrease the computational complexity of learning multiple latent space groups, MGTL utilizes an indirect-promotion strategy to connect different latent space groups [35]; it exploits the label information in the source domains and the latent shared information in one latent space group to learn the corresponding distributions. The main contributions of this paper are three-fold: 1) Motivated by the significant observation that different latent spaces may not promote each other effectively in building a shared bridge, we propose a novel method, MGTL, which constructs multiple relevant groups for knowledge transfer.
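One way to read the indirect-promotion idea is that each group couples to its predecessor only through the latent space they share, so learning the whole chain is a single linear pass over the groups. The sketch below approximates that coupling by fitting each latent dimension once and reusing it wherever it reappears in a neighbouring group; scikit-learn's two-factor NMF stands in for the paper's tri-factorization, and the function and its settings are illustrative, not the authors' algorithm.

    from sklearn.decomposition import NMF

    def chain_groups(X, groups, seed=0):
        # groups: overlapping tuples such as [(8, 16), (16, 32), ...].
        # A dimension shared with the previous group is reused rather
        # than refitted, which is what links adjacent groups and keeps
        # the total cost linear in the number of groups.
        learned = {}  # latent dimension -> word-topic matrix
        for dims in groups:
            for k in dims:
                if k in learned:  # shared latent space: reuse it
                    continue
                model = NMF(n_components=k, init="random",
                            random_state=seed, max_iter=300)
                model.fit(X)
                learned[k] = model.components_
        return learned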

RELATED WORK
COMPUTATIONAL COMPLEXITY
EXPERIMENTAL EVALUATION
CONCLUSION