Abstract

Language usage can change over time, but document classifiers are usually trained and tested on corpora spanning multiple years without considering temporal variation. This paper describes two complementary ways to adapt classifiers to shifts across time. First, we show that diachronic word embeddings, originally developed to study language change, can also improve document classification, and we present a simple method for constructing this type of embedding. Second, we propose a time-driven neural classification model inspired by methods for domain adaptation. Experiments on six corpora show how these methods can make classifiers more robust over time.

Summary

Introduction

Language changes and varies over time, which can degrade the performance of natural language processing models. Recent research has shown that document classifiers can become more stable over time when trained in ways that account for temporal variations (Huang and Paul, 2018; He et al., 2018). We refer to the task of accounting for such variations during training as temporality adaptation. Recent research has also used diachronic word embeddings to study how language changes over time (Kulkarni et al., 2015; Hamilton et al., 2016; Kutuzov et al., 2018). These studies have shown that shifts in corpora across time cause changes in word contexts and in the learned representations.
