Evaluating Commonsense in Pre-Trained Language Models

Xuhui Zhou,Leyang Cui,Dandan Huang,Yue Zhang

doi:10.1609/aaai.v34i05.6523

Abstract

Contextualized representations trained over large raw text data have given remarkable improvements for NLP tasks including question answering and reading comprehension. There have been works showing that syntactic, semantic and word sense knowledge are contained in such representations, which explains why they benefit such tasks. However, relatively little work has been done investigating commonsense knowledge contained in contextualized representations, which is crucial for human question answering and reading comprehension. We study the commonsense ability of GPT, BERT, XLNet, and RoBERTa by testing them on seven challenging benchmarks, finding that language modeling and its variants are effective objectives for promoting models' commonsense ability while bi-directional context and larger training set are bonuses. We additionally find that current models do poorly on tasks require more necessary inference steps. Finally, we test the robustness of models by making dual test cases, which are correlated so that the correct prediction of one sample should lead to correct prediction of the other. Interestingly, the models show confusion on these test cases, which suggests that they learn commonsense at the surface rather than the deep level. We release a test set, named CATs publicly, for future research.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

R Discovery Prime

Evaluating Commonsense in Pre-Trained Language Models

Abstract

Talk to us

Similar Papers

More From: Proceedings of the AAAI Conference on Artificial Intelligence

Lead the way for us

Journal: Proceedings of the AAAI Conference on Artificial Intelligence	Publication Date: Apr 3, 2020
Citations: 49

Similar Papers

Considering Commonsense in Solving QA: Reading Comprehension with Semantic Search and Continual Learning
Seungwon Jeong ... Dongsuk Oh
Applied Sciences | VOL. 12
Seungwon Jeong, et. al.Seungwon Jeong ... Dongsuk Oh
19 Apr 2022
Applied Sciences | VOL. 12

Neural Transfer Learning For Vietnamese Sentiment Analysis Using Pre-trained Contextual Language Models
An Pha Le ... Tran Vu Pham
-
An Pha Le, et. al.An Pha Le ... Tran Vu Pham
16 Dec 2021
16 Dec 2021

On the Power of Pre-Trained Text Representations
Yu Meng ... Jiawei Han
-
Yu Meng, et. al.Yu Meng ... Jiawei Han
14 Aug 2021
14 Aug 2021

Towards an Enhanced Understanding of Bias in Pre-trained Neural Language Models: A Survey with Special Emphasis on Affective Bias
Anoop K ... Lajish V L
-
Anoop K, et. al. Anoop K ... Lajish V L
01 Jan 2021
01 Jan 2021

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

Evaluating Commonsense in Pre-Trained Language Models

Abstract

Talk to us

Similar Papers

More From: Proceedings of the AAAI Conference on Artificial Intelligence