PARSING-DRIVEN GENERALIZATION FOR NATURAL LANGUAGE ACQUISITION

Rey-Long Liu, Von-Wun Soo

doi:10.1142/s0218001493000315

Abstract

Parsing is an important step in natural language processing. It involves searching for applicable grammatical rules that transform natural language sentences into their corresponding parse trees. Parsing can therefore be viewed as problem solving, and language acquisition can be achieved by generalizing problem-solving heuristics. In this paper, we investigate how machine learning methodologies can be integrated with a Wait-And-See Parser (the problem solver) to acquire the parsing-related knowledge that the parser needs. We call this approach parsing-driven generalization, since learning (acquisition of parsing rules and classification of lexicons) is derived from the parsing process. Three types of generalization are reported in this paper: simple generalization, generalization by asking questions, and generalization back-propagation. Simple generalization generalizes any two parsing rules whose action parts (right-hand sides) are the same but whose condition parts (left-hand sides) have a single difference. Generalization by asking questions is triggered when a “climbing-up” move on a concept hierarchy is attempted; it is needed to avoid over-generalization. Generalization back-propagation propagates a confirmed generalization of a later parsing rule back to the preceding rules in a parsing sequence, causing them to be generalized as well; this can reduce the number of questions asked by the system. With these three types of generalization and a mechanism for maintaining lexicon classification (the domain concept hierarchy), parsing and learning can interact to use and acquire parsing-related knowledge. To improve the practical performance of parsing after learning, a relaxation parsing mechanism is also designed to process unseen sentences.
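
As a rough illustration of simple generalization combined with a question-asking check, the following is a minimal Python sketch, not the authors' implementation. It assumes a rule is a pair (condition part, action part), that the concept hierarchy is a tree stored as a child-to-parent map, and that the single differing condition element is replaced by the nearest common ancestor of the two concepts. The category names, the action label `attach-np`, and the `confirm` callback (standing in for the question put to a human trainer before a "climbing-up" move) are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A rule is (condition part, action part); this representation is an assumption.
Rule = Tuple[Tuple[str, ...], str]

# Hypothetical concept hierarchy, stored as child -> parent.
PARENT: Dict[str, str] = {
    "dog": "animal",
    "cat": "animal",
    "animal": "noun",
    "table": "noun",
}


def ancestors(concept: str, parent: Dict[str, str]) -> List[str]:
    """Return the concept followed by its ancestors, nearest first."""
    chain = [concept]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def nearest_common_ancestor(a: str, b: str, parent: Dict[str, str]) -> Optional[str]:
    """Return the lowest concept that covers both a and b, if any."""
    seen = set(ancestors(a, parent))
    for concept in ancestors(b, parent):
        if concept in seen:
            return concept
    return None


def simple_generalize(
    r1: Rule,
    r2: Rule,
    parent: Dict[str, str],
    confirm: Callable[[str], bool] = lambda concept: True,
) -> Optional[Rule]:
    """Merge two rules with equal actions and exactly one differing condition."""
    (cond1, act1), (cond2, act2) = r1, r2
    if act1 != act2 or len(cond1) != len(cond2):
        return None
    diffs = [i for i, (x, y) in enumerate(zip(cond1, cond2)) if x != y]
    if len(diffs) != 1:
        return None
    i = diffs[0]
    general = nearest_common_ancestor(cond1[i], cond2[i], parent)
    # Climbing up the hierarchy is only accepted if the question is answered "yes".
    if general is None or not confirm(general):
        return None
    return cond1[:i] + (general,) + cond1[i + 1:], act1


rule_a: Rule = (("det", "dog"), "attach-np")
rule_b: Rule = (("det", "cat"), "attach-np")
merged = simple_generalize(rule_a, rule_b, PARENT)
```

Here `merged` is a single rule whose second condition is `animal` rather than `dog` or `cat`. Passing a `confirm` function that rejects the proposed concept leaves both rules ungeneralized, which is one simple way to model how asking questions guards against over-generalization.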

Full Text