Abstract
Learning semantic sentence embeddings benefits a wide variety of natural language processing tasks. Recently, methods that fine-tune pre-trained language models within a contrastive learning framework have been proposed and have achieved strong performance on sentence embedding benchmarks. However, sentence embeddings easily "overfit" to the contrastive learning objective: as contrastive training proceeds, the gap between the contrastive objective and the test tasks leads to unstable, or even declining, test-task performance. For this reason, existing methods rely on a labeled development set to frequently evaluate test-task performance and select the best checkpoint, which limits these models when labeled data is unavailable or extremely scarce. To address this problem, we propose Pseudo-Siamese network Mutual Learning (PSML) for self-supervised sentence embeddings, which reduces the gap between contrastive learning and test tasks. PSML takes mutual learning as its basic framework and consists of a main encoder and an auxiliary encoder; two mutual learning losses are constructed between the two encoders to share learning signals. The proposed framework and losses help the model optimize more stably and generalize better to test tasks such as semantic textual similarity. Extensive experiments on seven public semantic textual similarity datasets show that PSML outperforms previous unsupervised contrastive methods for sentence embeddings.
Moreover, PSML yields a stable test-task performance curve during training and achieves comparable performance without frequent evaluation on a labeled development set.
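The abstract does not give the exact form of PSML's losses. As a rough illustration of the two ingredients it names, the sketch below shows one plausible instantiation in plain numpy: an InfoNCE-style contrastive loss computed per encoder, plus a KL-divergence term between the two encoders' similarity distributions that could serve as a mutual-learning signal. All function names, the temperature value, and the choice of KL as the mutual-learning loss are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def info_nce(emb_a, emb_b, temp=0.05):
    """InfoNCE contrastive loss: row i of emb_a and row i of emb_b
    are a positive pair; all other rows in the batch are negatives."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T / temp                        # (N, N) cosine similarities
    probs = softmax(sim, axis=1)
    idx = np.arange(sim.shape[0])
    return -np.mean(np.log(probs[idx, idx]))    # -log p(positive | anchor)

def mutual_kl(sim_main, sim_aux, temp=0.05):
    """Hypothetical mutual-learning term: KL divergence between the
    similarity distributions produced by the main and auxiliary encoders."""
    p = softmax(sim_main / temp, axis=1)
    q = softmax(sim_aux / temp, axis=1)
    return np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=1))
```

Under this reading, each encoder would be trained on its own contrastive loss while the KL term pulls the two encoders' similarity structures toward each other, regularizing the main encoder against overfitting the contrastive objective alone.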
IEEE/ACM Transactions on Audio, Speech, and Language Processing