A Divide-and-Conquer Method for Scalable Robust Multitask Learning.

Yan Pan,Rongkai Xia,Jian Yin,Ning Liu

doi:10.1109/tnnls.2015.2406759

Abstract

Multitask learning (MTL) aims at improving the generalization performance of multiple tasks by exploiting the shared factors among them. An important line of research in the MTL is the robust MTL (RMTL) methods, which use trace-norm regularization to capture task relatedness via a low-rank structure. The existing algorithms for the RMTL optimization problems rely on the accelerated proximal gradient (APG) scheme that needs repeated full singular value decomposition (SVD) operations. However, the time complexity of a full SVD is O(min(md(2),m(2)d)) for an RMTL problem with m tasks and d features, which becomes unaffordable in real-world MTL applications that often have a large number of tasks and high-dimensional features. In this paper, we propose a scalable solution for large-scale RMTL, with either the least squares loss or the squared hinge loss, by a divide-and-conquer method. The proposed method divides the original RMTL problem into several size-reduced subproblems, solves these cheaper subproblems in parallel by any base algorithm (e.g., APG) for RMTL, and then combines the results to obtain the final solution. Our theoretical analysis indicates that, with high probability, the recovery errors of the proposed divide-and-conquer algorithm are bounded by those of the base algorithm. Furthermore, in order to solve the subproblems with the least squares loss or the squared hinge loss, we propose two efficient base algorithms based on the linearized alternating direction method, respectively. Experimental results demonstrate that, with little loss of accuracy, our method is substantially faster than the state-of-the-art APG algorithms for RMTL.

Full Text