Abstract

In data-to-text Natural Language Generation (NLG) systems, computers need to find the right words to describe phenomena seen in the data. This paper focuses on the problem of choosing appropriate verbs to express the direction and magnitude of a percentage change (e.g., in stock prices). Rather than simply using the same verbs again and again, we present a principled data-driven approach to this problem, based on Shannon’s noisy-channel model, so as to bring variation and naturalness into the generated text. Our experiments on three large-scale real-world news corpora demonstrate that the proposed probabilistic model can be trained to accurately imitate human authors’ patterns of verb usage, significantly outperforming the state-of-the-art method.

Highlights

  • Natural Language Generation (NLG) is a fundamental task in Artificial Intelligence (AI) (Russell and Norvig, 2009)

  • Making use of probabilistic reasoning, the principled approach to handling uncertainties, we argue that the function f should be determined by the posterior probability P(w|x)

  • ... to express percentage changes with different directions and magnitudes. This model does not rely on hard-wired heuristics but is learned from training examples extracted from large-scale real-world news corpora
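
The posterior mentioned in the second highlight can be unpacked with Bayes' rule, which is the standard way a noisy-channel model is factored. The following is a sketch of that textbook identity, not necessarily the paper's exact formulation. It assumes that w stands for a candidate verb and x for the percentage change being described, and that the prior P(w) and the likelihood P(x|w) are estimated from the training corpora.

```latex
P(w \mid x) = \frac{P(x \mid w)\, P(w)}{P(x)} \;\propto\; P(x \mid w)\, P(w)
```

Because P(x) does not depend on w, comparing verbs for a given change only requires the product of the likelihood and the prior.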


Summary

Introduction

Natural Language Generation (NLG) is a fundamental task in Artificial Intelligence (AI) (Russell and Norvig, 2009). It aims to automatically turn structured data into prose (Reiter, 2007; Belz and Kow, 2009), the opposite of the better-known field of Natural Language Processing (NLP), which transforms raw text into structured data (e.g., a logical form or a knowledge base) (Jurafsky and Martin, 2009). We elect to use relative percentages rather than absolute numbers to describe the change from one data point to another. Given two data points (e.g., on a stock chart), the percentage change can always be calculated.
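
As an illustration of the percentage change between two data points, here is a minimal Python sketch. The function names `percentage_change` and `direction` are hypothetical and do not come from the paper, and the handling of a zero starting value is an assumption of this sketch (the relative change is undefined in that case).

```python
def percentage_change(old: float, new: float) -> float:
    """Return the relative change from `old` to `new`, in percent."""
    if old == 0:
        # Assumption of this sketch: a zero baseline has no relative change.
        raise ValueError("percentage change is undefined for a zero starting value")
    return (new - old) * 100.0 / abs(old)


def direction(change: float) -> str:
    """Label the sign of a percentage change (labels are illustrative only)."""
    if change > 0:
        return "rise"
    if change < 0:
        return "fall"
    return "flat"


if __name__ == "__main__":
    # Example: a price moving from 200 to 150.
    change = percentage_change(200.0, 150.0)
    print(change, direction(change))
```

The sign of the result gives the direction and its absolute value gives the magnitude, the two properties that the chosen verb is meant to express.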

