Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models

Keith Hall, Yinfei Yang, Daniel Cer, Jianmo Ni, Gustavo Hernandez Abrego, Noah Constant, Ji Ma

doi:10.48448/y8x0-9060

Abstract

We provide the first exploration of sentence embeddings from text-to-text transformers (T5), including the effects of scaling up sentence encoders to 11B parameters. Sentence embeddings are broadly useful for language processing tasks. While T5 achieves impressive performance on language tasks, it is unclear how to produce sentence embeddings from encoder-decoder models. We investigate three methods to construct Sentence-T5 (ST5) models: two utilize only the T5 encoder and one uses the full T5 encoder-decoder. We establish a new sentence representation transfer benchmark, SentGLUE, which extends the SentEval toolkit to nine tasks from the GLUE benchmark (Wang et al., 2018). Our encoder-only models outperform the previous best models on both SentEval and SentGLUE transfer tasks, including semantic textual similarity (STS). Scaling up ST5 from millions to billions of parameters is shown to consistently improve performance. Finally, our encoder-decoder method achieves a new state-of-the-art on STS when using sentence embeddings.
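
As a rough illustration of what producing a sentence embedding from encoder outputs can involve, the sketch below averages per-token vectors into one fixed-size vector and compares two sentences by cosine similarity, the usual scoring for STS. The abstract does not say how token outputs are combined, so mean pooling is an assumption here, and the function names and toy vectors are hypothetical stand-ins rather than real T5 outputs.

```python
import math

def mean_pool(token_vectors, mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, mask) if keep]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for encoder outputs (assumed, not real model activations):
# three token positions of dimension 2, where the last position is padding.
sent_a_tokens = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
sent_a_mask = [1, 1, 0]
sent_b_tokens = [[2.0, 1.0], [2.0, 1.0], [0.0, 0.0]]
sent_b_mask = [1, 1, 0]

emb_a = mean_pool(sent_a_tokens, sent_a_mask)
emb_b = mean_pool(sent_b_tokens, sent_b_mask)
score = cosine_similarity(emb_a, emb_b)
```

Masking matters in this sketch: without it, the padding position would shift the average and change the similarity score even though it carries no content.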

Full Text