Word-to-word Machine Translation: Bilateral Similarity Retrieval for Mitigating Hubness

Mengting Luo, Fei Han, Dejun Zhang, Long Tian, Linchao He, Haibo Pu, Mingyue Guo

doi:10.1088/1757-899x/533/1/012051

Abstract

Nearest neighbor search plays a critical role in machine word translation because it obtains the lingual labels of source word embeddings by searching for the k nearest neighbor (kNN) target embeddings in a shared bilingual semantic space. However, aligning two language distributions into a shared space usually requires large amounts of target labels, and kNN retrieval causes the hubness problem in high-dimensional feature spaces. Although most best-k retrieval methods mitigate hubness by removing hubs from the list of translation candidates, eliminating hubs is flawed, because a hub also has a correct source word query corresponding to it and should not be crudely excluded. In this paper, we introduce an unsupervised machine word translation model based on Generative Adversarial Nets (GANs) with Bilingual Similarity retrieval, namely Unsupervised-BSMWT. Our model addresses three main challenges: (1) it reduces the dependence on parallel data by using GANs in a fully unsupervised way; (2) it significantly decreases the training time of the adversarial game; and (3) it proposes a novel Bilingual Similarity retrieval that mitigates hubness pollution regardless of whether a candidate is a hub. Our model efficiently achieves competitive results in 74 min, exceeding previous GANs-based models.
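To make the hubness problem concrete, the following is a minimal sketch of plain cosine kNN retrieval together with the k-occurrence count N_k, a common diagnostic for hubs (targets that appear in many queries' neighbor lists). It is not the paper's Bilingual Similarity retrieval, whose details the abstract does not give. The function names `cosine_knn` and `k_occurrence`, the random embeddings, and the sizes used are all assumptions chosen for illustration.

```python
import numpy as np

def cosine_knn(src, tgt, k):
    """Return, for each source row, the indices of its k most cosine-similar target rows."""
    src_n = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt_n = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sims = src_n @ tgt_n.T  # shape (n_src, n_tgt)
    # Sort by descending similarity and keep the first k columns.
    return np.argsort(-sims, axis=1)[:, :k]

def k_occurrence(neighbors, n_tgt):
    """N_k(y): number of source queries that list target y among their k nearest neighbors."""
    return np.bincount(neighbors.ravel(), minlength=n_tgt)

# Hypothetical stand-ins for aligned source and target word embeddings.
rng = np.random.default_rng(0)
src = rng.standard_normal((500, 300))
tgt = rng.standard_normal((500, 300))

neighbors = cosine_knn(src, tgt, k=10)
counts = k_occurrence(neighbors, n_tgt=tgt.shape[0])
print("max N_k:", counts.max(), "mean N_k:", counts.mean())
```

A target whose N_k is far above the mean behaves as a hub. Removing such targets from the candidate list also removes the correct translation for whichever source word genuinely maps to them, which is the flaw the abstract points out.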


Journal: IOP Conference Series: Materials Science and Engineering
Publication Date: May 1, 2019
License type: cc-by
