Abstract

This paper reports initial experiments in learning valence (subcategorisation) frames of Polish verbs from a morphosyntactically annotated corpus. The learning algorithm consists of two modules: a linguistic module, which performs very simple shallow parsing of the input text (nominal and prepositional phrase recognition) and identifies valence frame cues (hypotheses), and a statistical module, which implements three well-known inferential statistics (likelihood ratio, t-test, binomial miscue probability test). The three statistics are evaluated and compared with a baseline that selects frames on the basis of the relative frequencies of frame/verb co-occurrences. The results clearly reflect the many deficiencies of the linguistic analysis and the inadequacy of these statistical measures for a free word order language rich in ellipsis and morphosyntactic syncretisms, but they are nevertheless promising.
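To make the contrast between a statistical filter and the relative-frequency baseline concrete, the following is a minimal sketch, not the implementation used in the experiments. It assumes the binomial miscue probability test takes its usual form: a verb seen n times, m of them with a cue for a given frame, is credited with that frame when m or more cue occurrences would be unlikely to arise from miscues alone. The function names, the miscue probability of 0.05, the significance level of 0.05, and the baseline cutoff of 0.1 are all illustrative assumptions; the abstract gives none of these values.

```python
from math import comb


def binomial_tail(n: int, m: int, p: float) -> float:
    """Return P(X >= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def accept_frame_binomial(n: int, m: int,
                          miscue_prob: float = 0.05,  # assumed value
                          alpha: float = 0.05) -> bool:  # assumed value
    """Accept the frame if m or more cues in n verb occurrences are
    unlikely to be produced by miscues alone."""
    return binomial_tail(n, m, miscue_prob) < alpha


def accept_frame_baseline(n: int, m: int, cutoff: float = 0.1) -> bool:
    """Baseline: accept the frame if the relative frequency of the
    frame/verb co-occurrence reaches an (assumed) cutoff."""
    return n > 0 and m / n >= cutoff


if __name__ == "__main__":
    # Hypothetical counts: a verb seen 100 times, 20 of them with the cue.
    print(accept_frame_binomial(100, 20), accept_frame_baseline(100, 20))
    # The same verb with only 5 cue occurrences.
    print(accept_frame_binomial(100, 5), accept_frame_baseline(100, 5))
```

With these made-up counts the two decisions happen to agree. They can diverge for rare verbs, where a high relative frequency rests on very few observations and the binomial tail is not small enough to reject the miscue explanation.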
