Improving Transition-Based Dependency Parsing of Hindi and Urdu by Modeling Syntactically Relevant Phenomena

Riyaz Ahmad Bhat, Irshad Ahmad Bhat, Dipti Misra Sharma

doi:10.1145/3005447

Abstract

In recent years, transition-based parsers have shown promise in terms of both efficiency and accuracy. Although these parsers have been explored extensively for multiple Indian languages, there is still considerable scope for improvement by properly incorporating syntactically relevant information. In this article, we enhance transition-based parsing of Hindi and Urdu by redefining the features and feature extraction procedures previously proposed in the parsing literature on Indian languages. We show empirically that properly incorporating syntactically relevant information such as case marking, complex predication, and grammatical agreement in an arc-eager parsing model can significantly improve parsing accuracy. Our experiments show an absolute improvement of ∼2% in labeled attachment score (LAS) for both Hindi and Urdu over a competitive baseline that uses rich features such as part-of-speech (POS) tags, chunk tags, cluster IDs, and lemmas. We also propose heuristics for identifying ezafe constructions in Urdu text, which show promising results in parsing these constructions.
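For readers unfamiliar with the arc-eager model mentioned in the abstract, the following is a minimal, unlabeled sketch of the standard arc-eager transition system with a static oracle. It is not the authors' implementation: the names `Config`, `legal`, `apply_action`, `static_oracle`, and `parse_with_oracle` are hypothetical, the Hindi example and its head assignments are a toy analysis chosen for illustration rather than the treebank's annotation, and the feature-based classifier that the article actually improves (case marking, complex predication, agreement) is replaced here by an oracle that reads the gold tree.

```python
from dataclasses import dataclass, field

ROOT = 0  # index of the artificial root node


@dataclass
class Config:
    """Parser configuration: a stack, a buffer, and the arcs built so far."""
    stack: list
    buffer: list
    heads: dict = field(default_factory=dict)  # dependent index -> head index


def legal(c, action):
    """Preconditions of the four arc-eager transitions."""
    if action == "SHIFT":
        return bool(c.buffer)
    if action == "RIGHT-ARC":
        return bool(c.stack) and bool(c.buffer)
    if action == "LEFT-ARC":
        # The stack top must not be the root and must not have a head yet.
        return (bool(c.stack) and bool(c.buffer)
                and c.stack[-1] != ROOT and c.stack[-1] not in c.heads)
    if action == "REDUCE":
        # Only a token that already has a head may be popped.
        return bool(c.stack) and c.stack[-1] in c.heads
    return False


def apply_action(c, action):
    """Apply one transition to the configuration in place."""
    if not legal(c, action):
        raise ValueError(f"illegal transition: {action}")
    if action == "SHIFT":
        c.stack.append(c.buffer.pop(0))
    elif action == "LEFT-ARC":       # buffer front becomes head of stack top
        c.heads[c.stack.pop()] = c.buffer[0]
    elif action == "RIGHT-ARC":      # stack top becomes head of buffer front
        c.heads[c.buffer[0]] = c.stack[-1]
        c.stack.append(c.buffer.pop(0))
    elif action == "REDUCE":
        c.stack.pop()


def static_oracle(c, gold):
    """Choose the next transition from a gold projective tree (dep -> head)."""
    s = c.stack[-1] if c.stack else None
    b = c.buffer[0]
    if s is not None and s != ROOT and gold.get(s) == b:
        return "LEFT-ARC"
    if s is not None and gold.get(b) == s:
        return "RIGHT-ARC"
    # Reduce when the buffer front still needs an arc to a token below s.
    if s is not None and s in c.heads and any(
            gold.get(k) == b or gold.get(b) == k for k in c.stack[:-1]):
        return "REDUCE"
    return "SHIFT"


def parse_with_oracle(n_tokens, gold):
    """Run the oracle until the buffer is empty; return (actions, heads)."""
    c = Config(stack=[ROOT], buffer=list(range(1, n_tokens + 1)))
    actions = []
    while c.buffer:
        action = static_oracle(c, gold)
        apply_action(c, action)
        actions.append(action)
    return actions, c.heads


# Toy example (assumed analysis): 1=raam 2=ne 3=seb 4=khaayaa, "Ram ate an apple".
# The case marker "ne" attaches to the noun; both nouns attach to the verb.
gold_heads = {1: 4, 2: 1, 3: 4, 4: ROOT}
actions, heads = parse_with_oracle(4, gold_heads)
print(actions)
print(heads == gold_heads)
```

In a trained parser, the call to `static_oracle` would be replaced by a classifier that scores the legal transitions from features of the current configuration; the article's contribution concerns which features that classifier sees, not the transition system itself.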

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Publication Date: Jan 20, 2017
Citations: 11

Similar Papers

Transition-based Neural Constituent Parsing
Taro Watanabe ... Eiichiro Sumita
01 Jan 2015

Complex predicates in Indian languages and wordnets
Pushpak Bhattacharyya ... Debasri Chakrabarti
Language Resources and Evaluation | VOL. 40
07 Sep 2007

Development of Part-of-Speech tagger for a low-resource endangered language
Toshal Gore ... Vaibhav Khatavkar
16 Dec 2022

Transition-based dependency parser with postponed determinations for Japanese sentences
Xiaobo Xi ... Akihiro Inokuchi
01 Dec 2017
