Abstract

We present a new approach for generating role-labeled training data using Linked Lexical Resources, i.e., integrated lexical resources that combine several resources (e.g., Word-Net, FrameNet, Wiktionary) by linking them on the sense or on the role level. Unlike resource-based supervision in relation extraction, we focus on complex linguistic annotations, more specifically FrameNet senses and roles. The automatically labeled training data ( www.ukp.tu-darmstadt.de/knowledge-based-srl/ ) are evaluated on four corpora from different domains for the tasks of word sense disambiguation and semantic role classification. Results show that classifiers trained on our generated data equal those resulting from a standard supervised setting.

Highlights

  • In this work, we present a novel approach to automatically generate training data for semantic role labeling (SRL)

  • Our novel approach to training data generation for FrameNet SRL uses the paradigm of distant supervision (Mintz et al, 2009) which has become popular in relation extraction

  • We show that discriminating patterns can improve the quality of the automatic sense labels. (ii) We use a distant supervision approach – building on lexical resources (LLRs) – to address the complex problem of training data generation for FrameNet role labeling, which builds upon the sense labeling in (i). (iii) Our detailed evaluation and analysis show that our approach for data generation is able to generalize across domains and languages

Read more

Summary

Introduction

We present a novel approach to automatically generate training data for semantic role labeling (SRL). It follows the distant supervision paradigm and performs knowledge-based label transfer from rich external knowledge sources to large corpora. Even though unsupervised approaches continue to gain popularity, SRL is typically still solved using supervised training on labeled data. Creating such labeled data requires manual annotations by experts,. Our novel approach to training data generation for FrameNet SRL uses the paradigm of distant supervision (Mintz et al, 2009) which has become popular in relation extraction. A particular type of knowledge base relevant for distant supervision are linked lexical resources (LLRs): integrated lexical resources that combine several resources (e.g., WordNet, FrameNet, Wiktionary) by linking them on the sense or on the role level

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.