Abstract

In all languages, the ambiguity of simple forms poses a difficult problem for automatic text analysis. The data collected in the present work give broad coverage of lexical ambiguities in French. They are obtained from the electronic dictionary DELAF, which contains more than 650,000 forms. For the purposes of analysis, ambiguities are distributed into four large groups: ambiguities between two verbal forms, ambiguities between a verbal form and a non-verbal form, ambiguities between a noun and a non-verbal form, and remaining ambiguities. Each group is further divided into homogeneous subsets: for instance, one subset includes all forms that are ambiguous between a verb in the present indicative and a feminine noun. This classification into groups and subsets is presented with many illustrative examples. The inventory of all ambiguous French forms is an essential tool for writing the algorithms needed to resolve ambiguities in text analysis.
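
The four-group distribution described above can be illustrated with a minimal sketch. It is not the authors' procedure: the toy lexicon, the category tags ("V", "N", "A", "ADV"), the function names, and the rule used when a form could fall into more than one group are all assumptions made for illustration, and the entries do not reproduce the actual DELAF format.

```python
from collections import defaultdict

# Hypothetical, deliberately incomplete toy lexicon: (form, category) pairs.
# Not real DELAF entries; each form lists only the readings needed here.
ENTRIES = [
    ("suis", "V"),    # "être", present indicative
    ("suis", "V"),    # "suivre", present indicative
    ("porte", "V"),   # present indicative of "porter"
    ("porte", "N"),   # feminine noun
    ("bien", "N"),
    ("bien", "ADV"),
    ("même", "A"),
    ("même", "ADV"),
    ("maison", "N"),  # single reading in this toy lexicon
]


def classify(categories):
    """Return a group label for one form, or None if it has a single reading."""
    if len(categories) < 2:
        return None
    verbs = sum(1 for c in categories if c == "V")
    non_verbs = len(categories) - verbs
    # Assumption: a form with both verbal and non-verbal readings is filed
    # under "verb / non-verb", even if it also has two verbal readings.
    if verbs and non_verbs:
        return "verb / non-verb"
    if verbs:
        return "verb / verb"
    if "N" in categories:
        return "noun / non-verbal"
    return "remaining"


def group_forms(entries):
    """Collect the readings of each form, then distribute ambiguous forms into groups."""
    readings = defaultdict(list)
    for form, category in entries:
        readings[form].append(category)
    groups = defaultdict(list)
    for form, cats in readings.items():
        label = classify(cats)
        if label is not None:
            groups[label].append(form)
    return {label: sorted(forms) for label, forms in groups.items()}


if __name__ == "__main__":
    for label, forms in group_forms(ENTRIES).items():
        print(label, forms)
```

A finer division into homogeneous subsets (for example, present indicative verb versus feminine noun) would need the inflectional features of each reading, which this sketch leaves out.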
