Abstract
Abstract We give a general framework for inference in spanning tree models. We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models. Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms. These algorithms are easy to implement with or without automatic differentiation software. We motivate the development of our framework with several cautionary tales of previous research, which has developed numerous inefficient algorithms for computing expectations and their gradients. We demonstrate how our framework efficiently computes several quantities with known algorithms, including the expected attachment score, entropy, and generalized expectation criteria. As a bonus, we give algorithms for quantities that are missing in the literature, including the KL divergence. In all cases, our approach matches the efficiency of existing algorithms and, in several cases, reduces the runtime complexity by a factor of the sentence length. We validate the implementation of our framework through runtime experiments. We find our algorithms are up to 15 and 9 times faster than previous algorithms for computing the Shannon entropy and the gradient of the generalized expectation objective, respectively.
Highlights
Dependency trees are a fundamental combinatorial structure in natural language processing
We focus on edge-factored models where the probability of a dependency tree is proportional to the product the weights of its edges
We show that the gradient of the generalized expectation (GE) criterion can be evaluated in O N 3
Summary
Dependency trees are a fundamental combinatorial structure in natural language processing. The result is still used in more recent work (Ma and Hovy, 2017; Liu and Lapata, 2018) We build upon this tradition through a framework for computing expectations of a rich family of functions under a distribution over trees. Our framework is motivated by the lack of a unified approach for computing expectations over spanning trees in the literature. We believe this gap has resulted in the publication of numerous inefficient algorithms. We have released a reference implementation at the following URL: https://github .com/rycolab/tree expectations
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Transactions of the Association for Computational Linguistics
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.