Abstract

We created a supertagger for Spanish that disambiguates the HPSG lexical frames of the verbs, nouns and adjectives in a sentence. The supertagger uses a maximum entropy model and, on the test set, achieves an accuracy of 84.16% over the verb classes, 86.60% over the noun classes and 91.30% over the adjective classes. The tagset contains 92 verb classes, 27 noun classes and 13 adjective classes, extracted from a Spanish HPSG-compatible annotated corpus that was created by automatically transforming the AnCora Spanish corpus. The tags encode each word's argument structure, together with the syntactic categories and semantic roles of its arguments, all of which are important components of HPSG-style feature structures.
