Abstract

We present a brief history and overview of statistical methods in frame-semantic parsing – the automatic analysis of text using the theory of frame semantics. We discuss how the FrameNet lexicon and frame-annotated datasets have been used by statistical NLP researchers to build usable, state-of-the-art systems. We also focus on future directions in frame-semantic parsing research, and discuss NLP applications that could benefit from this line of work.

1 Frame-Semantic Parsing

Frame-semantic parsing has been considered as the task of automatically finding semantically salient targets in text, disambiguating their semantic frame representing an event and scenario in discourse, and annotating arguments consisting of words or phrases in text with various frame elements (or roles). The FrameNet lexicon (Baker et al., 1998), an ontology inspired by the theory of frame semantics (Fillmore, 1982), serves as a repository of semantic frames and their roles. Figure 1 depicts a sentence with three evoked frames for the targets “million”, “created” and “pushed” with FrameNet frames and roles.

Automatic analysis of text using frame-semantic structures can be traced back to the pioneering work of Gildea and Jurafsky (2002). Although their experimental setup relied on a primitive version of FrameNet and only made use of “exemplars” or example usages of semantic frames (containing one target per sentence) as opposed to a “corpus” of sentences, it resulted in a flurry of work in the area of automatic semantic role labeling (Marquez et al., 2008). However, the focus of semantic role labeling (SRL) research has mostly been on PropBank (Palmer et al., 2005) conventions, where verbal targets could evoke a “sense” frame, which is not shared across targets, making the frame disambiguation setup different from the representation in FrameNet.
Furthermore, it is fair to say that early research on PropBank focused primarily on argument structure prediction, and the interaction between frame and argument structure analysis has mostly been unaddressed (Marquez et al., 2008). There are exceptions, where the verb frame has been taken into account during SRL (Meza-Ruiz and Riedel, 2009; Watanabe et al., 2010). Moreover, the CoNLL 2008 and 2009 shared tasks also included verb and noun frame identification in their evaluations, although the overall goal was to predict semantic dependencies based on PropBank, and not full argument spans (Surdeanu et al., 2008; Hajic et al., 2009).

The SemEval 2007 shared task (Baker et al., 2007) attempted to revisit the frame-semantic analysis task based on FrameNet. It introduced a larger FrameNet lexicon (version 1.3), and also a larger corpus with full-text annotations compared to prior work, with multiple targets annotated per sentence. Unlike any other single dataset, the corpus allowed words and phrases from the noun, verb, adjective, adverb, number, determiner, conjunction and preposition syntactic categories to serve as targets and evoke frames; it also allowed targets from different syntactic categories to share frames, and therefore roles. The repository of semantic role types was also much richer than in PropBank-style lexicons, numbering in the several hundreds.

Most systems participating in the task resorted to a cascade of classifiers and rule-based modules: identifying targets (a non-trivial subtask), disambiguating frames, identifying potential arguments, and then labeling them with roles. The system described by Johansson and Nugues (2007) performed the best in this shared task. Next, we focus on its performance, and subsequent improvements made by the research community on this task.
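
To make the four-stage cascade concrete, the following is a minimal Python sketch of its control flow. It is not the method of any system cited here. The lexicon, the role inventory, and every function name are hypothetical placeholders, the frame and role names are illustrative rather than drawn from FrameNet data, and each stage is a trivial heuristic standing in for what would normally be a trained classifier or rule-based module.

```python
from dataclasses import dataclass, field

# Toy lexicon and role inventory: illustrative placeholders, NOT FrameNet data.
TOY_LEXICON = {
    "created": ["Creating"],
    "pushed": ["Cause_motion", "Cause_change_on_scale"],
}
TOY_ROLES = {
    "Creating": ["Creator", "Created_entity"],
    "Cause_motion": ["Agent", "Theme"],
    "Cause_change_on_scale": ["Agent", "Item"],
}


@dataclass
class FrameAnalysis:
    target_index: int
    target: str
    frame: str
    # Maps a role name to a (start, end) token span, end exclusive.
    roles: dict = field(default_factory=dict)


def identify_targets(tokens):
    """Stage 1: pick tokens that can evoke a frame (here: a lexicon lookup)."""
    return [i for i, tok in enumerate(tokens) if tok.lower() in TOY_LEXICON]


def disambiguate_frame(tokens, i):
    """Stage 2: choose one frame for the target (here: the first candidate)."""
    return TOY_LEXICON[tokens[i].lower()][0]


def candidate_spans(tokens, i):
    """Stage 3: propose argument spans (here: everything left and right of the target)."""
    spans = []
    if i > 0:
        spans.append((0, i))
    if i + 1 < len(tokens):
        spans.append((i + 1, len(tokens)))
    return spans


def label_roles(frame, spans):
    """Stage 4: assign roles to spans (here: in inventory order)."""
    return dict(zip(TOY_ROLES[frame], spans))


def parse(tokens):
    """Run the four stages in sequence, once per identified target."""
    analyses = []
    for i in identify_targets(tokens):
        frame = disambiguate_frame(tokens, i)
        roles = label_roles(frame, candidate_spans(tokens, i))
        analyses.append(FrameAnalysis(i, tokens[i], frame, roles))
    return analyses


if __name__ == "__main__":
    for analysis in parse("The company created new jobs".split()):
        print(analysis)
```

Because each stage consumes the output of the one before it, an error made early (for example, a missed target) cannot be recovered later in the cascade.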
