Meaning representation and text planning

Christine Defrise, Sergei Nirenburg

doi:10.3115/991146.991185

Abstract

The data flow in natural language generation (NLG) starts with a 'world' state, represented by structures of an application program (e.g., an expert system) that has text generation needs and an impetus to produce a natural language text. The output of generation is a natural language text. The generation process involves the tasks of a) delimiting the content of the eventual text, b) planning its structure, c) selecting lexical, syntactic and word order means of realizing this structure and d) actually realizing the text using the latter. In advanced generation systems these processes are treated not in a monolithic way, but rather as components of a large, modular generator. NLG researchers experiment with various ways of delimiting the modules of the generation process and with control architectures to drive these modules (see, for instance, McKeown, 1985; Hovy, 1987; or Meteer, 1989). But regardless of the decisions about general (intermodular) or local (intramodular) control flow, knowledge structures have to be defined to support processing and facilitate communication among the modules. The natural language generator DIOGENES (e.g., Nirenburg et al., 1989) was originally designed for use in machine translation. This means that the content delimitation stage is unnecessary, as the set of meanings to be realized by the generator is obtained in machine translation as a result of source text analysis.
The first processing component in DIOGENES is, therefore, its text planner, which takes as input a text meaning representation (TMR) and a set of static pragmatic factors (similar to Hovy's (1987) rhetorical goals) and produces a text plan (TP), a structure containing information about the order and boundaries of target language sentences and the decisions about reference realization and lexical selection. At the next stage, a set of semantics-to-syntax mapping rules is used to produce a set of target-language syntactic structures (we use the f-structures of LFG; see, e.g., Nirenburg and Levin, 1989). Finally, a syntactic realizer produces a target language text from the set of f-structures. To produce texts of adequate quality, natural language generation needs a sufficiently expressive input language. In this paper we discuss several important aspects of the knowledge and the processing at the text planning stage of a generation system. First, we describe a comprehensive language processing paradigm which underlies work on both generation and analysis of natural language in our environment. Next, we illustrate the features of our meaning representation languages, the text meaning representation language TAMERLAN and the text plan representation language TPL. Finally, we describe the mechanism of text planning in DIOGENES and illustrate the formalism and the strategy for acquiring text planning rules.
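
The three-stage data flow described above (text planner, semantics-to-syntax mapping, syntactic realizer) can be illustrated with a minimal Python sketch. It is not the DIOGENES implementation: every class, function, field name and lexicon entry below is hypothetical, the "one proposition per sentence" boundary policy is a toy assumption, and the dictionaries only loosely imitate TMR, TP and f-structure content.

```python
from dataclasses import dataclass, field


@dataclass
class TextMeaning:
    """Hypothetical stand-in for a TMR: a list of proposition records."""
    propositions: list


@dataclass
class PlannedSentence:
    """One sentence slot in the plan, with the lexical choices made for it."""
    proposition_ids: list
    lexical_choices: dict


@dataclass
class TextPlan:
    """Hypothetical stand-in for a TP: sentences in their output order."""
    sentences: list = field(default_factory=list)


def plan_text(tmr, pragmatic_factors, lexicon):
    """Text planner: fixes sentence order, boundaries and lexical selection."""
    register = pragmatic_factors.get("formality", "neutral")
    plan = TextPlan()
    # Toy assumption: each proposition becomes exactly one sentence.
    for prop in tmr.propositions:
        choices = {}
        for role, concept in prop["roles"].items():
            entry = lexicon[concept]
            # Fall back to the neutral word if the register has no entry.
            choices[role] = entry.get(register, entry["neutral"])
        plan.sentences.append(PlannedSentence([prop["id"]], choices))
    return plan


def map_to_syntax(plan):
    """Mapping stage: one flat, f-structure-like dict per planned sentence."""
    return [
        {
            "PRED": s.lexical_choices["event"],
            "SUBJ": {"PRED": s.lexical_choices["agent"]},
            "OBJ": {"PRED": s.lexical_choices["theme"]},
        }
        for s in plan.sentences
    ]


def realize(f_structures):
    """Realizer: linearizes each structure in subject-verb-object order."""
    sentences = []
    for fs in f_structures:
        text = " ".join([fs["SUBJ"]["PRED"], fs["PRED"], fs["OBJ"]["PRED"]])
        sentences.append(text[0].upper() + text[1:] + ".")
    return " ".join(sentences)


def generate(tmr, pragmatic_factors, lexicon):
    """Runs the three stages in sequence, as in the data flow above."""
    return realize(map_to_syntax(plan_text(tmr, pragmatic_factors, lexicon)))


if __name__ == "__main__":
    # Hypothetical lexicon: concept -> register -> surface word.
    lexicon = {
        "BUY": {"neutral": "buys", "formal": "purchases"},
        "COMPANY": {"neutral": "the company", "formal": "the firm"},
        "COMPUTER": {"neutral": "a computer"},
    }
    tmr = TextMeaning([
        {"id": "p1",
         "roles": {"event": "BUY", "agent": "COMPANY", "theme": "COMPUTER"}},
    ])
    print(generate(tmr, {"formality": "formal"}, lexicon))
```

The point of the sketch is the division of labour: lexical selection and sentence boundaries are settled in the plan, so the later stages only translate and linearize decisions that have already been made.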

Full Text