Abstract

This paper proposes an additional layer of annotation for the recently established Hindi/Urdu Treebank. Despite the fact that the treebank already features a number of annotation layers such as phrase structure, dependency relations and predicate-argument structure, we see potential for the inclusion of a dependency layer generated from Lexical-Functional Grammar (LFG) f-structures with relations that we believe are crucial for a deep analysis of Urdu/Hindi. The suggestions are based on theoretical and computational investigations into Urdu/Hindi in the context of the Urdu ParGram grammar, through which we can automatically create the additional annotation layer.

Highlights

  • Many statistical natural language processing applications rely on treebanks where syntactic and semantic patterns of language are annotated in order to provide a relatively comprehensive sample of linguistic constructions with a theoretically well-founded representation of these constructions

  • The Hindi/Urdu Treebank (Palmer et al., 2007; Bhatt et al., 2009) is a novel attempt to create a multi-layered treebank for Indo-Aryan languages; it features different annotation levels, namely a phrase structure annotation inspired by the Chomskyan approach to syntax (Chomsky, 1981, 1995), a level of dependency annotation following the Computational Pāṇinian Grammar (Bharati et al., 1995; Begum et al., 2008), and the marking of predicate-argument structure in the PropBank style (Palmer et al., 2005)

  • We propose an additional layer of dependency annotation for the Hindi/Urdu Treebank (HUTB)

Summary

Introduction

Many statistical natural language processing (NLP) applications rely on treebanks where syntactic and semantic patterns of language are annotated in order to provide a relatively comprehensive sample of linguistic constructions with a theoretically well-founded representation of these constructions. The dependency annotation mainly expresses verb-centric relations as developed by Pāṇini, i.e. the relation of arguments with respect to a given verb. These relations can be divided into kar. The proposed layer of annotation is then presented; in particular, we discuss the dependency annotation of linguistic phenomena such as modality and tense/aspect (in Sections 3.1 and 3.2, respectively).

Ingredients
Lexical-Functional Grammar
An additional dependency annotation for the HUTB
Modality
Multiword entities
Ambiguity management
Summary and outlook