What should be encoded by position embedding for neural network language models?

Shuiyuan Yu, Haitao Liu, Zihao Zhang

doi:10.1017/s1351324923000128

Abstract

Word order is one of the most important grammatical devices and the basis for language understanding. However, the Transformer, one of the most popular NLP architectures, does not explicitly encode word order. A solution to this problem is to incorporate position information by means of position encoding/embedding (PE). Although a variety of methods for incorporating position information have been proposed, the NLP community still lacks detailed statistical research on position information in real-life language. To understand in more detail how position information influences the correlation between words, we investigated the factors that affect the frequency of words and word sequences in large corpora. Our results show that absolute position, relative position, being at one of the two ends of a sentence, and sentence length all significantly affect the frequency of words and word sequences. In addition, we observed that the frequency distribution of word sequences over relative position carries valuable grammatical information. Our study suggests that, to capture word–word correlations accurately, it is not enough to focus merely on absolute and relative position. Transformers should have access to more types of position-related information, which may require improvements to the current architecture.
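As a rough illustration of the kind of statistic the abstract describes, the following Python sketch counts how often a word occurs at each absolute position, and how often a word pair occurs at each relative position (signed offset), in a toy corpus. The function names, the `max_offset` parameter, the toy sentences, and the whitespace tokenization are all assumptions made for illustration; this is not the authors' method or data.

```python
from collections import Counter, defaultdict


def absolute_position_counts(sentences):
    """Map each word to a Counter of the 0-based positions where it occurs."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for position, word in enumerate(sentence):
            counts[word][position] += 1
    return counts


def relative_position_counts(sentences, first, second, max_offset=3):
    """Count signed offsets (index of `second` minus index of `first`)
    for every co-occurrence of the two words within a sentence."""
    counts = Counter()
    for sentence in sentences:
        first_idx = [i for i, w in enumerate(sentence) if w == first]
        second_idx = [j for j, w in enumerate(sentence) if w == second]
        for i in first_idx:
            for j in second_idx:
                offset = j - i
                # Skip offset 0 (same token) and pairs that are too far apart.
                if offset != 0 and abs(offset) <= max_offset:
                    counts[offset] += 1
    return counts


# Toy corpus (hypothetical), tokenized by whitespace.
corpus = [s.split() for s in [
    "the cat sat on the mat",
    "the dog saw the cat",
    "a cat saw the dog",
]]

abs_counts = absolute_position_counts(corpus)
rel_counts = relative_position_counts(corpus, "the", "cat")
```

On a real corpus, such counts would also need to be conditioned on sentence length and on whether a word sits at a sentence boundary, since the abstract reports that both factors affect frequency.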


Journal: Natural Language Engineering
Publication Date: May 10, 2023
License type: CC BY 4.0
