Abstract

Do state-of-the-art natural language understanding models care about word order, one of the most important characteristics of a sequence? Not always! We found that 75% to 90% of the correct predictions of BERT-based classifiers, trained on many GLUE tasks, remain constant after input words are randomly shuffled. Although BERT embeddings are famously contextual, the contribution of each individual word to downstream tasks is almost unchanged even after the word's context is shuffled. BERT-based models are able to exploit superficial cues (e.g., the sentiment of keywords in sentiment analysis, or the word-wise similarity between sequence-pair inputs in natural language inference) to make correct decisions when tokens are arranged in random orders. Encouraging classifiers to capture word order information improves performance on most GLUE tasks, SQuAD 2.0, and out-of-sample data. Our work suggests that many GLUE tasks do not challenge machines to understand the meaning of a sentence.
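
To make the shuffling experiment concrete, the sketch below (not the authors' code) randomly shuffles the words of a sentence and checks whether an off-the-shelf sentiment classifier's prediction changes; the HuggingFace `pipeline` call, its default model, and the example sentence are illustrative assumptions.

```python
# Minimal sketch: shuffle the words of an input sentence and check whether
# a pretrained sentiment classifier's prediction stays the same.
# Assumes the HuggingFace `transformers` library; the default pipeline model
# is used purely for illustration.
import random
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # default English sentiment model

def shuffle_words(sentence: str, seed: int = 0) -> str:
    words = sentence.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

original = "the movie was surprisingly good despite a slow start"
shuffled = shuffle_words(original)

pred_original = classifier(original)[0]["label"]
pred_shuffled = classifier(shuffled)[0]["label"]

print(original, "->", pred_original)
print(shuffled, "->", pred_shuffled)
# If the two labels match, word order did not affect this prediction.
```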

Highlights

  • Machine learning (ML) models recently achieved excellent performance on state-of-the-art benchmarks for evaluating natural language understanding (NLU)

  • We chose to answer this question for SST-2 and QNLI because they have the lowest Word-Order Sensitivity (WOS) scores across all six GLUE tasks tested (Table 2) and they are representative of single-sentence and sequence-pair tasks, respectively; see the sketch of a word-order-sensitivity measurement after this list

  • After the second fine-tuning on downstream tasks, we observed that all models were substantially more sensitive to word order than the baseline models
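
One plausible way to turn the shuffling experiment into a sensitivity score is sketched below: among examples the model originally classifies correctly, measure how often the prediction flips once the words are shuffled. This is a simplified stand-in rather than the paper's exact WOS formula; the `predict` callable and the (text, label) example format are assumptions.

```python
# Hypothetical word-order-sensitivity measurement: among examples the model
# classifies correctly, how often does the prediction change after the words
# are randomly shuffled? `predict` stands in for any text classifier
# (e.g. a fine-tuned BERT) that maps a string to a class index.
import random
from typing import Callable, List, Tuple

def word_order_sensitivity(
    predict: Callable[[str], int],
    examples: List[Tuple[str, int]],
    seed: int = 0,
) -> float:
    rng = random.Random(seed)
    flipped, correct = 0, 0
    for text, label in examples:
        if predict(text) != label:
            continue  # only consider originally-correct predictions
        correct += 1
        words = text.split()
        rng.shuffle(words)
        if predict(" ".join(words)) != label:
            flipped += 1
    return flipped / correct if correct else 0.0
```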


Summary

Introduction

Machine learning (ML) models recently achieved excellent performance on state-of-the-art benchmarks for evaluating natural language understanding (NLU). In July 2019, RoBERTa (Liu et al., 2019) was the first to surpass the human baseline on GLUE (Wang et al., 2019). Since then, 13 more methods have outperformed humans on the GLUE leaderboard. At least 8 of these 14 solutions are based on BERT (Devlin et al., 2019), a transformer architecture that learns representations via a bidirectional encoder. Given their superhuman GLUE scores, how do BERT-based models solve NLU tasks? We shed light on this question by examining model sensitivity to the order of words. Word order is one of the key characteristics of a sequence.
