Abstract

Recently, much research on learning cross-lingual word embeddings without parallel data has achieved success by exploiting the isomorphism of word embedding spaces across languages. However, unsupervised cross-lingual sentence representation, which aims to learn a unified semantic space without parallel data, has not been well explored. Although many cross-lingual tasks can be solved by learning a unified sentence representation across languages on top of cross-lingual word embeddings, the performance of such approaches is not competitive with their supervised counterparts. In this paper, we propose a novel framework for unsupervised cross-lingual sentence representation learning that exploits linguistic isomorphism at both the word and sentence levels. After generating pseudo-parallel sentences based on pre-trained cross-lingual word embeddings, the framework iteratively conducts sentence modeling, word embedding tuning, and parallel sentence updates. Our experiments show that the proposed framework achieves state-of-the-art results on many cross-lingual tasks and also improves the quality of the cross-lingual word embeddings. The code and pre-trained encoders will be released upon publication.
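To make the iteration concrete, the sketch below shows how the three stages described above (sentence modeling, word embedding tuning, parallel sentence updates) could feed into one another. This is a minimal toy under stated assumptions, not the paper's method: sentence modeling is reduced to word-vector averaging, embedding tuning to an orthogonal Procrustes refit, and the pseudo-parallel update to nearest-neighbor mining; all function names are hypothetical.

```python
# Minimal sketch of the iterative loop (NumPy only). The averaging
# "encoder", Procrustes "tuning", and nearest-neighbor "mining" are
# stand-ins for the paper's actual components, which this text does
# not specify.
import numpy as np

def sent_embed(sents, emb):
    # Sentence modeling stand-in: mean of word vectors.
    # Assumes every sentence has at least one in-vocabulary word.
    return np.stack([np.mean([emb[w] for w in s if w in emb], axis=0)
                     for s in sents])

def mine_pairs(src_vecs, tgt_vecs):
    # Pseudo-parallel update: nearest target sentence by cosine similarity.
    a = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    b = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    return np.argmax(a @ b.T, axis=1)

def procrustes(X, Y):
    # Embedding tuning stand-in: orthogonal W minimizing ||XW - Y||_F.
    u, _, vt = np.linalg.svd(X.T @ Y)
    return u @ vt

def iterate(src_sents, tgt_sents, src_emb, tgt_emb, n_iters=5):
    for _ in range(n_iters):
        sv = sent_embed(src_sents, src_emb)
        tv = sent_embed(tgt_sents, tgt_emb)
        idx = mine_pairs(sv, tv)        # update pseudo-parallel pairs
        W = procrustes(sv, tv[idx])     # refit the cross-lingual map
        src_emb = {w: v @ W for w, v in src_emb.items()}  # tune source space
    return src_emb
```

In the actual framework, the averaging step would be a trained sentence encoder; the skeleton only illustrates how the mined pairs, the encoder, and the word embeddings improve each other across iterations.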
