Senseval: The CL Research Experience

Kenneth C Litkowski

doi:10.1023/a:1002463718479

Abstract

The CL Research Senseval system was the highest performing system among the "All-words" systems, with an overall fine-grained score of 61.6 percent for precision and 60.5 percent for recall on 98 percent of the 8,448 texts in the revised submission (up by almost 6 and 9 percent, respectively, from the first submission). The results were achieved with an almost complete reliance on syntactic behavior, using (1) a robust and fast ATN-style parser producing parse trees with annotations on nodes, (2) DIMAP dictionary creation and maintenance software (after conversion of the Hector dictionary files) to hold dictionary entries, and (3) a strategy for analyzing the parse trees in concert with the dictionary data. Considerable further improvements are possible in the parser, in the exploitation of the Hector data (and the representation of dictionary entries), and in the analysis strategy, while still relying on syntactic and collocational data. The Senseval data (the dictionary entries and the corpora) provide an excellent testbed for understanding the sources of failures and for evaluating changes in the CL Research system.
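
The relationship between the precision, recall, and coverage figures quoted above can be illustrated with a minimal sketch. It assumes the usual convention that precision is computed over the instances a system attempts and recall over all instances, so that recall equals precision times coverage; under that assumption, 0.616 × 0.98 ≈ 0.604, which is close to the reported 60.5 percent given that the coverage figure is rounded. The function name `precision_recall` and the counts below are hypothetical and are not taken from the Senseval results.

```python
def precision_recall(correct, attempted, total):
    """Return (precision, recall, coverage) for a word-sense tagging run.

    Assumed convention (not confirmed by the abstract):
      precision = correct / attempted   (only instances the system answered)
      recall    = correct / total       (all instances in the test set)
      coverage  = attempted / total
    """
    if attempted <= 0 or total <= 0:
        raise ValueError("attempted and total must be positive")
    if not (0 <= correct <= attempted <= total):
        raise ValueError("expected 0 <= correct <= attempted <= total")
    precision = correct / attempted
    recall = correct / total
    coverage = attempted / total
    return precision, recall, coverage


# Hypothetical counts chosen only to illustrate the arithmetic.
p, r, c = precision_recall(correct=60, attempted=98, total=100)
print(f"precision={p:.3f} recall={r:.3f} coverage={c:.3f}")
```

Because recall is precision scaled by coverage, a system that attempts nearly every instance, as here, reports precision and recall that differ by only about a point.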

Full Text