Abstract
We present a detailed error analysis of a transition-based dependency parser trained on a Hindi dependency treebank. Parser error analysis has not previously been examined systematically from the point of view of treebanking, and this work aims to contribute in this area.
 We address two main questions in this paper:
 
 Can the parsing of certain structures be made easier by using alternative analyses for these structures?
 Are there certain linguistic cues implicit (or missing) in the current treebank that can be made explicit (or added) in order to make the parsing of complex constructions easier?
 
 These questions guide us in examining the potential benefits of parser error analysis during treebanking. Through our experiments and analysis, we were able to shed light on the causes of errors and subsequently improve the performance of the parser.
Highlights
Since the availability of the Penn Treebank (Marcus et al., 1993), treebanks have played a crucial role in our attempt to build automatic natural language processing tools for various languages
The error analysis helps us formulate the questions that we address in this work
The results show that the lexical information for conjunctions is in itself sufficient to disambiguate coordination vs. subordination structures correctly, and the added valency information seems to be redundant
Summary
Since the availability of the Penn Treebank (Marcus et al., 1993), treebanks have played a crucial role in our attempt to build automatic natural language processing tools for various languages. While the analysis of parser errors is used to improve parser performance (by discovering new learning features or redesigning parsing algorithms), it is rarely used to inform guideline decisions. We carry out a detailed error analysis of a transition-based dependency parser trained on a Hindi dependency treebank. The obvious benefit of such an exercise is a potential improvement in parser accuracy. More importantly, it can help the treebank developer validate various guideline choices by reinforcing decisions that were correct and pointing towards possible revisions.