Abstract

Linear kernel support vector machines (SVMs) using either the $$L_{1}$$-norm or the $$L_{2}$$-norm have emerged as an important and widely used classification algorithm for many applications such as text chunking, part-of-speech tagging, information retrieval, and dependency parsing. $$L_{2}$$-norm SVMs usually provide slightly better accuracy than $$L_{1}$$-SVMs in most tasks. However, $$L_{2}$$-norm SVMs produce many near-but-nonzero feature weights, and computing these nonsignificant weights is highly time-consuming. In this paper, we present a cutting-weight algorithm that guides the optimization process of $$L_{2}$$-SVMs toward a sparse solution. Before checking optimality, our method automatically discards a set of near-but-nonzero feature weights. The final objective can then be achieved when the objective function is satisfied by the remaining features and hypothesis. One characteristic of our cutting-weight algorithm is that it requires no changes to the original learning objective. To verify this concept, we conduct experiments on three well-known benchmarks, i.e., CoNLL-2000 text chunking, SIGHAN-3 Chinese word segmentation, and Chinese word dependency parsing. Our method achieves 1–10 times feature parameter reduction rates in comparison with the original $$L_{2}$$-SVMs, with slightly better accuracy and lower training time. In terms of run-time efficiency, our method is noticeably faster than the original $$L_{2}$$-regularized SVMs. For example, our sparse $$L_{2}$$-SVM is 2.55 times faster than the original $$L_{2}$$-SVM with the same accuracy.
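To make the idea concrete, the sketch below illustrates the general notion of pruning near-but-nonzero weights during L2-SVM optimization. It is a minimal, hypothetical example, not the paper's cutting-weight algorithm: the function name `sparse_l2_svm`, the pruning threshold `prune_eps`, the once-per-epoch pruning schedule, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (assumption, not the paper's method): an L2-regularized
# linear SVM trained by stochastic subgradient descent, with small weights
# periodically zeroed out to illustrate pruning near-but-nonzero features.
import numpy as np

def sparse_l2_svm(X, y, lam=0.01, lr=0.1, epochs=20, prune_eps=1e-3, seed=0):
    """X: (n, d) feature matrix; y: labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            margin = y[i] * X[i].dot(w)
            # Subgradient of the L2-regularized hinge loss for one example.
            grad = lam * w
            if margin < 1.0:
                grad -= y[i] * X[i]
            w -= lr * grad
        # Discard near-but-nonzero weights before continuing the optimization.
        w[np.abs(w) < prune_eps] = 0.0
    return w

if __name__ == "__main__":
    # Toy data: only the first 2 of 50 features are informative.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 50))
    y = np.sign(X[:, 0] + 0.5 * X[:, 1])
    w = sparse_l2_svm(X, y)
    print("nonzero weights:", np.count_nonzero(w), "of", w.size)
```

Under these assumptions, the pruning step only changes the iterate, not the learning objective itself, which mirrors the abstract's claim that no change to the original objective is required.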
