Abstract

Graph-based dependency parsing is inefficient when handling non-local features because of the high computational complexity of inference. In this paper, we propose an exact and efficient decoding algorithm based on the Branch and Bound (B&B) framework, in which non-local features are bounded by a linear combination of local features. Dynamic programming is used to search for the upper bound. Experiments are conducted on the English PTB and Chinese CTB datasets. We achieve competitive Unlabeled Attachment Scores (UAS) when no additional resources are available: 93.17% for English and 87.25% for Chinese. Parsing speed is 177 words per second for English and 97 words per second for Chinese. Our algorithm is general and can be adapted to non-projective dependency parsing or other graphical models.
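The bounding idea in the abstract can be illustrated on a toy problem. The following is only a minimal sketch under stated assumptions, not the paper's algorithm: each modifier independently picks one head (the tree and projectivity constraints are dropped), non-local features are scores on pairs of arcs, and the names `true_score`, `upper_bound` and `branch_and_bound` are hypothetical. A positive pair score w on arcs a and b is bounded by (w/2)(z_a + z_b), which is linear in the arc indicators, and a negative one by 0. Because the relaxed objective is purely local, it is maximised here by a per-modifier argmax, which stands in for the dynamic program a real parser would use.

```python
def true_score(heads, local, pair):
    """Exact score: heads[m] is the head of modifier m; pair maps (arc, arc) -> weight."""
    arcs = {(h, m) for m, h in enumerate(heads)}
    total = sum(local[m][h] for m, h in enumerate(heads))
    total += sum(w for (a, b), w in pair.items() if a in arcs and b in arcs)
    return total


def upper_bound(fixed, local, pair):
    """Upper bound over all completions of `fixed`, plus the relaxed maximiser."""
    adj = [row[:] for row in local]  # local scores adjusted by the linear bounds
    const = 0.0

    def status(arc):
        h, m = arc
        if m < len(fixed):
            return 1 if fixed[m] == h else 0
        return None  # head of m is still free

    for (a, b), w in pair.items():
        sa, sb = status(a), status(b)
        if sa == 0 or sb == 0:
            continue                      # feature cannot fire
        if sa == 1 and sb == 1:
            const += w                    # feature fires for certain
        elif sa == 1:
            adj[b[1]][b[0]] += w          # exact: w * z_b
        elif sb == 1:
            adj[a[1]][a[0]] += w          # exact: w * z_a
        elif w > 0:
            adj[a[1]][a[0]] += w / 2      # w*z_a*z_b <= (w/2)*(z_a + z_b)
            adj[b[1]][b[0]] += w / 2
        # w <= 0 with both arcs free: bounded above by 0

    heads = list(fixed)
    bound = const + sum(local[m][h] for m, h in enumerate(fixed))
    for m in range(len(fixed), len(local)):
        h = max(range(len(adj[m])), key=adj[m].__getitem__)
        heads.append(h)
        bound += adj[m][h]
    return bound, heads


def branch_and_bound(local, pair):
    """Exact maximisation: prune every subproblem whose bound cannot beat the incumbent."""
    best_score, best_heads = float("-inf"), None
    stack = [()]  # each entry fixes the heads of the first len(entry) modifiers
    while stack:
        fixed = stack.pop()
        bound, heads = upper_bound(fixed, local, pair)
        if bound <= best_score:
            continue
        score = true_score(heads, local, pair)  # relaxed maximiser gives a lower bound
        if score > best_score:
            best_score, best_heads = score, heads
        if len(fixed) < len(local) and bound > best_score:
            m = len(fixed)
            stack.extend(fixed + (h,) for h in range(len(local[m])))
    return best_score, best_heads
```

The search stays exact because a subproblem is discarded only when its upper bound cannot exceed the best score found so far; tighter bounds simply mean fewer branches are explored.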

Highlights

  • For graph-based projective dependency parsing, dynamic programming (DP) is popular for decoding because it handles local features efficiently. It performs cubic-time parsing for arc-factored models (Eisner, 1996; McDonald et al., 2005a) and biquadratic-time parsing for higher-order models with richer sibling and grandchild features (Carreras, 2007; Koo and Collins, 2010).

  • We propose a novel Branch and Bound (B&B) algorithm for efficient parsing with various non-local features

  • For general high-order models with non-local features, we propose to use Branch and Bound (B&B)


Summary

Introduction

For graph-based projective dependency parsing, dynamic programming (DP) is popular for decoding because it handles local features efficiently. It performs cubic-time parsing for arc-factored models (Eisner, 1996; McDonald et al., 2005a) and biquadratic-time parsing for higher-order models with richer sibling and grandchild features (Carreras, 2007; Koo and Collins, 2010).

Since reranking quality is bounded by the oracle performance of the candidates, some work has combined the candidate generation and reranking steps using cube pruning (Huang, 2008; Zhang and McDonald, 2012) to achieve higher oracle performance. These methods parse a sentence in bottom-up order and keep the top k derivations for each span using k-best parsing (Huang and Chiang, 2005). The disadvantage is that this approach tends to compute non-local features as early as possible so that the decoder can use that information at internal spans; as a result, it may miss long historical features such as long dependency chains.
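To make the cubic-time decoding for arc-factored models concrete, here is a minimal sketch of the first-order projective algorithm of Eisner (1996). It is an illustration under assumptions rather than code from the paper: `score[h][m]` is a hypothetical matrix of arc scores with token 0 as an artificial root, the root may take several children, and the function name `eisner` is chosen here for convenience.

```python
NEG = float("-inf")


def eisner(score):
    """Best projective tree for arc scores score[h][m]; returns (score, heads)."""
    N = len(score)
    # Spans are indexed [s][t][d]; d = 1: head is s (right arcs), d = 0: head is t.
    comp = [[[0.0, 0.0] for _ in range(N)] for _ in range(N)]   # complete spans
    inc = [[[NEG, NEG] for _ in range(N)] for _ in range(N)]    # incomplete spans
    bp_comp = [[[None, None] for _ in range(N)] for _ in range(N)]
    bp_inc = [[None] * N for _ in range(N)]

    for k in range(1, N):                 # span length
        for s in range(N - k):
            t = s + k
            # Join two complete spans that face each other, then add the arc s-t.
            r = max(range(s, t), key=lambda r: comp[s][r][1] + comp[r + 1][t][0])
            base = comp[s][r][1] + comp[r + 1][t][0]
            bp_inc[s][t] = r
            inc[s][t][0] = base + (score[t][s] if s > 0 else NEG)  # root takes no head
            inc[s][t][1] = base + score[s][t]
            # Complete span headed by t: complete left part plus incomplete span.
            r = max(range(s, t), key=lambda r: comp[s][r][0] + inc[r][t][0])
            comp[s][t][0] = comp[s][r][0] + inc[r][t][0]
            bp_comp[s][t][0] = r
            # Complete span headed by s: incomplete span plus complete right part.
            r = max(range(s + 1, t + 1), key=lambda r: inc[s][r][1] + comp[r][t][1])
            comp[s][t][1] = inc[s][r][1] + comp[r][t][1]
            bp_comp[s][t][1] = r

    heads = [-1] * N                      # heads[0] stays -1 for the root

    def back_inc(s, t, d):
        if d == 1:
            heads[t] = s
        else:
            heads[s] = t
        r = bp_inc[s][t]
        back_comp(s, r, 1)
        back_comp(r + 1, t, 0)

    def back_comp(s, t, d):
        if s == t:
            return
        r = bp_comp[s][t][d]
        if d == 0:
            back_comp(s, r, 0)
            back_inc(r, t, 0)
        else:
            back_inc(s, r, 1)
            back_comp(r, t, 1)

    back_comp(0, N - 1, 1)
    return comp[0][N - 1][1], heads
```

The two nested span loops combined with the loop over split points r give the O(n^3) running time. Every quantity in the tables depends only on single arcs, which is why features spanning many arcs, such as long dependency chains, do not fit this recursion directly.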
