Abstract

In this report we summarize the results of SemEval 2016 Task 8: Meaning Representation Parsing. Participants were asked to generate Abstract Meaning Representation (AMR) (Banarescu et al., 2013) graphs for a set of English sentences in the news and discussion forum domains. Eleven sites submitted valid systems. The availability of state-of-the-art baseline systems was a key factor in lowering the barrier to entry; many submissions used CAMR (Wang et al., 2015b; Wang et al., 2015a) as a baseline system and added extensions to it to improve scores. The evaluation set was quite difficult to parse, particularly because of creative approaches to word representation in the web forum portion. The top-scoring systems reached 0.62 F1 according to the Smatch (Cai and Knight, 2013) evaluation heuristic. We show some sample sentences along with a comparison of system parses and perform quantitative ablation studies.
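For context, Smatch scores a parser's AMR graph against the gold graph by searching for a variable mapping that maximizes the number of matching triples and then reporting an F1 over those triples. The Python sketch below shows only that final arithmetic; the mapping search itself (hill climbing in the reference implementation) is elided, and the triple counts are invented for illustration.

    # Illustrative only: the triple counts below are made up.
    def smatch_f1(matched, parsed_total, gold_total):
        """F1 over matched triples, in the spirit of Smatch (Cai and Knight, 2013)."""
        precision = matched / parsed_total   # fraction of parser triples that match
        recall = matched / gold_total        # fraction of gold triples recovered
        if precision + recall == 0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    # e.g. 31 matched triples out of 45 parsed and 50 gold triples:
    print(round(smatch_f1(31, 45, 50), 2))   # 0.65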

Highlights

  • Abstract Meaning Representation (AMR) is a compact, readable, whole-sentence semantic annotation (Banarescu et al., 2013)

  • The Abstract Meaning Representation (AMR) annotations in this corpus have changed somewhat from their counterparts in LDC2014E41, consistent with the evolution of the AMR standard. They contain wikification via the :wiki attribute, they use new PropBank framesets that are unified across parts of speech, they have been deepened in a number of ways, and various corrections have been applied (a small illustrative example follows these highlights)

  • 6.1.1 Brandeis / cemantix.org / RPI (Wang et al., 2016). This team, the originators of CAMR, started with their existing AMR parser and experimented with three sets of new features: 1) rich named entities, 2) a verbalization list, and 3) semantic role labels. They used the RPI Wikifier to wikify the concepts in the AMR graph
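As a concrete illustration of the conventions mentioned in the highlights (PropBank framesets, reentrant variables, and wikification via the :wiki attribute), here is a small, made-up AMR that is not drawn from the task data. The sketch parses it with the third-party penman Python package; both the example sentence and the package choice are assumptions of this sketch, not part of the task toolchain.

    # Hypothetical AMR for "Barack Obama wants to go." (illustration only)
    import penman  # third-party package: pip install penman

    amr = """
    (w / want-01
       :ARG0 (p / person
                :wiki "Barack_Obama"
                :name (n / name :op1 "Barack" :op2 "Obama"))
       :ARG1 (g / go-01
                :ARG0 p))
    """

    graph = penman.decode(amr)            # the graph is represented as triples
    for source, role, target in graph.triples:
        print(source, role, target)       # e.g. ('w', ':instance', 'want-01')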

Summary

Introduction

Abstract Meaning Representation (AMR) is a compact, readable, whole-sentence semantic annotation (Banarescu et al., 2013). Several parsers have been released in the past couple of years (Flanigan et al., 2014; Wang et al., 2015b; Werling et al., 2015; Wang et al., 2015a; Artzi et al., 2015; Pust et al., 2015). This body of work constitutes many diverse and interesting scientific contributions, but it is difficult to adequately determine which parser is numerically superior, due to heterogeneous evaluation decisions and the lack of a controlled blind evaluation. The purpose of this task was to provide a competitive environment in which to determine one winner and award a trophy to said winner.

Training Data
Other Resources
Evaluation Data
Task Definition
Participants and Results
CAMR-based systems
Other Approaches
Impact of Wikification
Performance on different parts of the AMR
Performance on different data sources
Easiest Sentences
Hardest Sentences
There Can Be Only One?
Conclusion