Abstract

The quality of the results obtainable from a system for syntactic analysis of a natural language by computer (with French as an example) depends crucially on the way the lexicon is constructed. To avoid incorrect parses of a correct sentence, verbal selection rules are required. These rules are the acceptability constraints observed between verb and subject and between verb and object in sentences of the form N1 tV N2 and N1 tV N2 P N3. It is well known that not every noun subclass is an acceptable subject (or object) for every verb in such sentences; those that are constitute the selection of the verb. By a careful definition of each subclass of nouns, verbs, adjectives, etc., these rules can be incorporated into Harris' string grammar, which is used here. The definitions should, insofar as possible, be based on formal distributional criteria, so that the classification will be independent of intuitive judgments. The main conclusions are that both the lexicon and the grammar must be very detailed if one wishes to eliminate all incorrect parses; that no automatic word classification is possible, for example, via a dialogue with the computer; and finally, that the computer can be used fruitfully to investigate distributional phenomena in as-yet-unexplored subdomains, for example, by analyzing scientific text.
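The selection mechanism described above can be sketched as a lexicon lookup: each verb lists the noun subclasses it accepts as subject and as object, and a candidate N1 tV N2 parse is kept only if both constraints hold. The following is a minimal illustrative sketch, not the paper's actual lexicon; the subclass labels, entries, and function names are hypothetical.

```python
# Hypothetical noun subclasses (the paper's real subclasses are defined
# by formal distributional criteria, not listed here).
NOUN_SUBCLASS = {
    "chercheur": "human",    # "researcher"
    "idee": "abstract",      # "idea"
    "table": "concrete",
}

# Hypothetical selection of each verb: the noun subclasses acceptable
# as its subject and as its object.
VERB_SELECTION = {
    "examine": {"subject": {"human"}, "object": {"abstract", "concrete"}},
}

def acceptable(n1: str, verb: str, n2: str) -> bool:
    """Return True if the N1 tV N2 combination satisfies the verb's selection."""
    sel = VERB_SELECTION.get(verb)
    if sel is None:
        return False
    return (NOUN_SUBCLASS.get(n1) in sel["subject"]
            and NOUN_SUBCLASS.get(n2) in sel["object"])

# A parser using such rules would keep "le chercheur examine l'idee"
# but reject the subject/object reversal as a parse.
print(acceptable("chercheur", "examine", "idee"))   # True
print(acceptable("idee", "examine", "chercheur"))   # False
```

In a full system, a check of this kind would be applied to every candidate parse produced by the string grammar, discarding those whose verb-subject or verb-object pairing violates the selection.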
