Abstract

This contribution investigates novel techniques for error detection in automatic semantic annotations, as an attempt to reconcile error-prone NLP processing with the high quality standards required for empirical research in Digital Humanities. We demonstrate the state-of-the-art performance of semantic NLP systems on a corpus of ritual texts and report the performance gains we obtain using domain adaptation techniques. Our main contribution is to explore new techniques for annotation consistency control. The novelty of our approach lies in its attempt to leverage multi-level semantic annotations by defining interaction constraints between local word-level semantic annotations and global discourse-level annotations. These constraints are defined using Markov Logic Networks, a logical formalism for statistical relational inference that allows for violable constraints. We report first results.
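As a rough illustration of what a violable interaction constraint between annotation layers can look like, the sketch below scores one toy annotation under a single weighted rule: "coreferent mentions share the same word sense". It is not the paper's actual rule set or a Markov Logic engine. The mention ids, sense labels, chain, weight, and function names are all hypothetical. The weighted count of satisfied rule groundings corresponds to the exponent of the unnormalized score a Markov Logic Network assigns to a world, and the violated groundings are the candidates a human annotator would be asked to check.

```python
from itertools import combinations

# Hypothetical word-level layer: mention id -> sense label.
senses = {"m1": "sense_priest", "m2": "sense_priest", "m3": "sense_vessel"}

# Hypothetical discourse-level layer: coreference chains over mention ids.
chains = [["m1", "m2", "m3"]]

# Assumed weight of the soft rule; a violation lowers the score
# but does not make the annotation impossible.
WEIGHT = 1.5


def groundings(chains):
    """Yield every pair of mentions that the discourse layer marks as coreferent."""
    for chain in chains:
        yield from combinations(chain, 2)


def score(senses, chains, weight=WEIGHT):
    """Return (weight * number of satisfied groundings, list of violated groundings)."""
    satisfied = 0
    violated = []
    for a, b in groundings(chains):
        if senses[a] == senses[b]:
            satisfied += 1
        else:
            violated.append((a, b))
    return weight * satisfied, violated


log_score, violated = score(senses, chains)
print(log_score)   # weighted count of satisfied groundings
print(violated)    # mention pairs to flag for manual inspection
```

In this toy setting, both violated pairs involve the same mention, which points to it as the most likely annotation error. A full Markov Logic treatment would instead run inference over all rules jointly.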

Highlights

  • The work described in this paper is embedded in an interdisciplinary project that aims at analyzing regularities and variances in the event structures of Nepalese rituals. The focus of this project is on investigating the event structure of rituals by applying computational linguistic analysis techniques to written descriptions of rituals. For scholars working in applied research in Digital Humanities, it is important that any evidence derived from computational analysis is accurate and reliable.

  • As the semantic annotation of verbs will be mainly covered by FrameNet annotations, we report on the performance of WordNet sense disambiguation for nouns and adjectives, alongside performance on all words.

  • For frame-semantic annotation, we identified performance problems that can be addressed by retraining the semantic role labeling system on our semi-automatically annotated domain corpora, [...] to the domain adaptation methods employed for preprocessing. To further reduce the gap between automatic annotation quality and the high quality standards required for empirical research in Digital Humanities, we investigated a novel approach to error detection using Markov Logic as the formal framework.


Summary

Introduction

The work described in this paper is embedded in an interdisciplinary project that aims at analyzing regularities and variances in the event structures of Nepalese rituals. The focus of this project is on investigating the event structure of rituals by applying computational linguistic analysis techniques to written descriptions of rituals. The gap in performance between current system outputs and (near-to-)perfect annotation quality is still considerable. In this contribution we investigate novel techniques for annotation error detection to guide manual annotation control or to acquire training material for domain adaptation.

Related Work
Corpus of ritual descriptions
NLP architecture
Word Sense Disambiguation
Frame-semantic labeling
Coreference Resolution
Exploiting Multiple Layers for Consistency Control
Experiments and Evaluation
Results
Conclusions and Future Work
