Abstract
Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge. However, traditional language models are only capable of remembering facts seen at training time, and often have difficulty recalling them. To address this, we introduce the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context. These mechanisms enable the model to render information it has never seen before, as well as generate out-of-vocabulary tokens. We also introduce the Linked WikiText-2 dataset, a corpus of annotated text aligned to the Wikidata knowledge graph whose contents (roughly) match the popular WikiText-2 benchmark. In experiments, we demonstrate that the KGLM achieves significantly better performance than a strong baseline language model. We additionally compare different language model’s ability to complete sentences requiring factual knowledge, showing that the KGLM outperforms even very large language models in generating facts.
Highlights
For language models to generate plausible sentences, they must be both syntactically coherent as well as consistent with the world they describe
This is problematic for comparing the performance of the knowledge graph language model (KGLM) to traditional language models on Linked WikiText-2 since there are a large number of rare entities whose alias tokens are outof-vocabulary
Even if the KGLM identifies the correct entity and copies the correct alias token with high probability, other models can attain better perplexity by assigning a higher probability to UNK
Summary
For language models to generate plausible sentences, they must be both syntactically coherent as well as consistent with the world they describe. Language models are quite skilled at generating grammatical sentences, and previous work has shown that language models possess some degree of common-sense reasoning and basic knowledge (Vinyals and Le, 2015; Serban et al, 2016; Trinh and Le, 2019), their ability to generate factually correct text is quite limited. The clearest limitation of existing language models is that they, at best, can only memorize facts observed during. [Super Mario Land] is a [1989] [side-scrolling] [platform video game] developed and published by [Nintendo] as a [launch title] for their [Game Boy] [handheld game console].
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.