Abstract

The main purpose of this paper is to make some linguistic contributions to an ISO initiative to formulate a standard on the semantic annotation framework for reference, coreference, and other types of anaphoric phenomena in natural language. To this end, we first briefly review some of the existing coreference annotation (CA) schemes. We then formulate an abstract syntax, ASynana, for anaphoric annotation, on which a variety of concrete syntaxes, such as an XML-based concrete syntax CSynanaX, can be developed to provide an interoperable representation format for the annotation. To establish, at least partially, the semantic adequacy of the proposed abstract syntax, we examine the possibility of developing a formal semantics based on it. Such a semantics may be accepted as a valid application of the proposed annotation scheme. Finally, we consider the multilingual applicability of the proposed ASynana by applying it to Korean, a non-inflectional agglutinating language with pro-drop properties.

Highlights

  • Datasets are important for any linguistic analysis in at least two respects. (1) a

  • [70] calls this a case of split antecedent, whereas [78] views it as an instance of multiple antecedent. While referring to these works and others to be cited, this paper aims at constructing a semantic annotation scheme (AS) for coreference and other anaphoric link phenomena in a language (English) that may be proposed as an ISO standard for language resources management.

  • This paper is an extended and revised version of [54]. It mainly aimed at making some linguistic contributions as necessary groundwork for an ISO initiative to produce an international standard on anaphoric annotation as part of a series of semantic annotation schemes.

Summary

Keywords

Anaphoric Link, Annotation, Concrete Syntax, Coreference, Formal Semantics, Semantic Annotation

The English datasets are used to carry the general discussion of anaphoric links in language, whereas the Korean datasets are used to check how it applies to non-European languages like Korean, a non-inflectional language known as an agglutinating language with pro-drop properties.

After {hermt deathmt11}mt12, {the professor}mt and {theirmt only daughtermt15}mt found {a large sum of money}mt left for themmt to share. Jonesmt, wanted to remarry and found {a charming young Vietnamese womanmt20}mt, aged 24, 40 years younger than hismt own daughtermt. If shemt agreed to marry himmt, shemt would receive {eighty million pounds}mt29 – {a million pounds}mt a yearmt31 – right after theirmt marriagemt and inherit the restmt after hismt deathmt. The ladymt refused hismt tempting offermt, instead marrying {a young British bandmastermt40}mt with {no promising careermt42}mt

Introduction
Preliminaries
Overview
Coindexing
The MUC-7 Coreference Task Definition
Extents
Coreference Links
Illustrations
The MMAX2 Multi-level Annotation Scheme
Markables
Anaphoric Links
Example
Attribute-Value Specifications
The Brandeis ISO-Space Annotation Guidelines
Identifying Markables and Extents
Anaphor-antecedent Pairs
Formal Description
A Sketch of Abstract Syntax
General Abstract Syntax
General
XML-based Concrete Syntax
Semantic Interpretations: A Sketch
Datasets
Pragmatic Features Associated with Pronouns in Korean
Discourse Situation as Part of Metadata
Findings
Concluding Remarks: A Summary