Abstract

We present the Structured Weighted Violations Perceptron (SWVP) algorithm, a new perceptron algorithm for structured prediction, that generalizes the Collins Structured Perceptron (CSP, (Collins, 2002)). Unlike CSP, the update rule of SWVP explicitly exploits the internal structure of the predicted labels. We prove that for linearly separable training sets, SWVP converges to a weight vector that separates the data, under certain conditions on the parameters of the algorithm. We further prove bounds for SWVP on: (a) the number of updates in the separable case; (b) mistakes in the non-separable case; and (c) the probability to misclassify an unseen example (generalization), and show that for most SWVP variants these bounds are tighter than those of the CSP special case. In synthetic data experiments where data is drawn from a generative hidden variable model, SWVP provides substantial improvements over CSP.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call