Abstract

Sentence embedding, which aims to learn an effective representation of a sentence, is an important component of many downstream tasks. Recently, most sentence-embedding methods have achieved encouraging results by combining contrastive learning with pre-trained models. However, on the one hand, these methods rely on discrete data augmentation to obtain the positive samples for contrastive learning, which can distort the original semantics of a sentence. On the other hand, most methods directly adopt contrastive frameworks from computer vision, which can limit contrastive training because text data are discrete and sparse compared with image data. To address these issues, we design SEBGM, a novel contrastive framework based on a generation model with multi-task learning, which obtains meaningful sentence embeddings through supervised contrastive training on natural language inference (NLI) data. SEBGM uses multi-task learning to better exploit the word-level and sentence-level semantic information of samples; in this way, its positive samples come from NLI rather than from data augmentation. Extensive experiments show that, by utilizing multi-task learning, our proposed SEBGM advances the state of the art in sentence embedding on semantic textual similarity (STS) tasks.
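The abstract does not give SEBGM's exact objective, but the supervised contrastive training on NLI that it describes is commonly formulated as an InfoNCE-style loss in which each premise is anchored to its entailment hypothesis as the positive and its contradiction hypothesis as a hard negative. The sketch below illustrates that standard formulation only; the function name and temperature value are assumptions, not details from the paper.

```python
import torch
import torch.nn.functional as F

def nli_supervised_contrastive_loss(anchor, positive, hard_negative,
                                    temperature=0.05):
    """Illustrative supervised contrastive loss on NLI triples.

    anchor:        (batch, dim) embeddings of premises
    positive:      (batch, dim) embeddings of entailment hypotheses
    hard_negative: (batch, dim) embeddings of contradiction hypotheses
    """
    # Cosine similarity of every anchor against every positive and
    # every hard negative in the batch: two (batch, batch) matrices.
    sim_pos = F.cosine_similarity(anchor.unsqueeze(1),
                                  positive.unsqueeze(0), dim=-1)
    sim_neg = F.cosine_similarity(anchor.unsqueeze(1),
                                  hard_negative.unsqueeze(0), dim=-1)

    # Concatenate into (batch, 2 * batch) logits; all non-matching
    # columns act as in-batch negatives.
    logits = torch.cat([sim_pos, sim_neg], dim=1) / temperature

    # The matching entailment pair sits on the diagonal of sim_pos,
    # i.e., column i is the target class for row i.
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)
```

Under this formulation, no augmented view of the input is ever needed: the entailment hypothesis plays the role that a perturbed copy of the sentence plays in augmentation-based frameworks, which is what lets the positives come from NLI rather than from data augmentation.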
