Abstract

Much of the power of probabilistic methods in modelling language comes from their ability to compare several derivations for the same string in the language. An important starting point for the study of such cross-derivational properties is the notion of _consistency_. The probability model defined by a probabilistic grammar is said to be _consistent_ if the probabilities assigned to all the strings in the language sum to one. From the literature on probabilistic context-free grammars (CFGs), we know precisely the conditions under which consistency holds for a given CFG. This paper derives the conditions under which a given probabilistic Tree Adjoining Grammar (TAG) can be shown to be consistent. It gives a simple algorithm for checking consistency and provides a formal justification of its correctness. The conditions derived here can be used to check probability models that use TAGs for _deficiency_ (i.e. whether any probability mass is assigned to strings that cannot be generated).
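For orientation, the classical PCFG result alluded to above (Booth and Thompson, 1973) states that a probabilistic CFG is consistent if the spectral radius of its expectation (first-moment) matrix is strictly less than one. The sketch below illustrates that test only; it is not the TAG algorithm of this paper, and the grammar encoding and helper names are illustrative assumptions.

```python
import numpy as np

def expectation_matrix(rules, nonterminals):
    """Build the first-moment matrix M, where M[i][j] is the expected
    number of occurrences of nonterminal j produced by one expansion
    of nonterminal i under the rule probabilities.

    rules: dict mapping a nonterminal to a list of (probability, rhs_symbols).
    """
    index = {nt: k for k, nt in enumerate(nonterminals)}
    m = np.zeros((len(nonterminals), len(nonterminals)))
    for lhs, productions in rules.items():
        for prob, rhs in productions:
            for sym in rhs:
                if sym in index:  # count only nonterminal occurrences
                    m[index[lhs], index[sym]] += prob
    return m

def is_consistent(rules, nonterminals):
    """Booth-Thompson test: consistent if the spectral radius of M is < 1."""
    m = expectation_matrix(rules, nonterminals)
    return max(abs(np.linalg.eigvals(m))) < 1.0

# Example: S -> S S (p = 0.4) | 'a' (p = 0.6).
# Expected S's per expansion of S is 0.4 * 2 = 0.8 < 1, so it is consistent.
grammar = {"S": [(0.4, ["S", "S"]), (0.6, ["a"])]}
print(is_consistent(grammar, ["S"]))  # True
```

The paper's contribution is an analogous condition and checking procedure for probabilistic TAGs, where the counting must account for both substitution and adjunction operations.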
