Abstract

We propose TandA, an effective technique for fine-tuning pre-trained Transformer models for natural language tasks. Specifically, we first transfer a pre-trained model into a model for a general task by fine-tuning it on a large and high-quality dataset. We then perform a second fine-tuning step to adapt the transferred model to the target domain. We demonstrate the benefits of our approach for answer sentence selection, which is a well-known inference task in Question Answering. We built a large-scale dataset to enable the transfer step, exploiting the Natural Questions dataset. Our approach establishes the state of the art on two well-known benchmarks, WikiQA and TREC-QA, achieving MAP scores of 92% and 94.3%, respectively, which largely outperform the previous highest scores of 83.4% and 87.5%. We empirically show that TandA generates more stable and robust models, reducing the effort required for selecting optimal hyper-parameters. Additionally, we show that the transfer step of TandA makes the adaptation step more robust to noise, enabling a more effective use of noisy datasets for fine-tuning. Finally, we also confirm the positive impact of TandA in an industrial setting, using domain-specific datasets subject to different types of noise.
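
The transfer-then-adapt recipe described above can be sketched with the Hugging Face transformers and datasets libraries. The sketch below is illustrative only, not the authors' released code: the model choice, dataset contents, column names (question, candidate, label), and hyper-parameters are assumptions chosen for exposition.

```python
# Illustrative two-step (transfer-then-adapt) fine-tuning sketch.
# All datasets and hyper-parameters below are placeholders, not the paper's exact setup.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-uncased"  # any pre-trained Transformer encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def encode(batch):
    # Answer sentence selection framed as question/candidate sentence-pair classification.
    return tokenizer(batch["question"], batch["candidate"],
                     truncation=True, padding="max_length", max_length=128)

def fine_tune(model, dataset, output_dir, lr):
    # One fine-tuning pass; called twice, first for transfer, then for adaptation.
    args = TrainingArguments(output_dir=output_dir, num_train_epochs=2,
                             learning_rate=lr, per_device_train_batch_size=32,
                             report_to="none")
    Trainer(model=model, args=args,
            train_dataset=dataset.map(encode, batched=True)).train()
    return model

# Tiny stand-in datasets. In practice, the transfer set would be a large,
# high-quality corpus built from Natural Questions, and the adapt set would
# be the target benchmark (e.g., WikiQA or TREC-QA) or a domain-specific set.
transfer_train = Dataset.from_dict({
    "question": ["who wrote hamlet", "who wrote hamlet"],
    "candidate": ["Hamlet was written by William Shakespeare.",
                  "Hamlet is a town in North Carolina."],
    "label": [1, 0]})
adapt_train = Dataset.from_dict({
    "question": ["what is the capital of france", "what is the capital of france"],
    "candidate": ["Paris is the capital of France.",
                  "France borders Spain."],
    "label": [1, 0]})

model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model = fine_tune(model, transfer_train, "out/transfer", lr=2e-5)  # transfer step
model = fine_tune(model, adapt_train, "out/adapt", lr=1e-5)        # adapt step
```

The key point is that the same model is fine-tuned twice in sequence: the first pass transfers it to the general answer-selection task, and the second adapts it to the target domain; the smaller learning rate in the second step is an illustrative choice, not a value taken from the paper.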
