Abstract

The penalized regression-based clustering model (PRClust) is an extension of the “sum-of-norms” clustering model. Three heuristic algorithms have previously been proposed for solving PRClust: (1) DC-CD, which combines difference of convex programming (DC) with a coordinate-wise descent algorithm (CD); (2) DC-ADMM, which combines DC with the alternating direction method of multipliers (ADMM); and (3) ALT, which uses alternating optimization. DC-CD uses $$ p \times n(n-1)/2 $$ scalar slack variables to solve PRClust, where n is the number of data points and p is the number of features. In each iteration of DC-CD, these slack variables and the cluster centers are updated by solving a second-order cone program (SOCP). DC-ADMM uses $$ p \times n \times (n-1) $$ scalar slack variables. In each iteration of DC-ADMM, these slack variables and the cluster centers are updated with a standard ADMM. In this paper, PRClust is first reformulated into an equivalent model, and a novel heuristic algorithm is then proposed to solve the reformulated model. Our proposed algorithm needs only $$ n(n-1)/2 $$ scalar slack variables, far fewer than DC-CD and DC-ADMM, and updates them with a simple equation in each iteration. Updating the slack variables is therefore less time-consuming in our proposed algorithm than in DC-CD and DC-ADMM. Our proposed algorithm updates only the cluster centers by solving an unconstrained convex quadratic problem, which is much smaller than the SOCP that DC-CD uses to update both the cluster centers and the slack variables. ALT, meanwhile, updates the cluster centers by solving an SOCP, whereas our proposed algorithm updates them by solving an unconstrained convex quadratic problem with the same number of variables. Solving an unconstrained convex quadratic problem is less time-consuming than solving an SOCP with the same number of variables. Our experimental results on 12 datasets confirm that our proposed algorithm runs faster than DC-ADMM and DC-CD.
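
To give a sense of scale, the slack-variable counts stated in the abstract can be evaluated for a concrete problem size. The following is a minimal sketch, not part of the paper: the function name `slack_variable_counts` and the example values n = 100 and p = 10 are assumptions chosen purely for illustration.

```python
def slack_variable_counts(n: int, p: int) -> dict[str, int]:
    """Scalar slack-variable counts quoted in the abstract.

    n: number of data points, p: number of features.
    The function name and dictionary keys are illustrative only.
    """
    pairs = n * (n - 1) // 2  # number of unordered pairs of data points
    return {
        "DC-CD": p * pairs,          # p * n(n-1)/2
        "DC-ADMM": p * n * (n - 1),  # p * n * (n-1)
        "proposed": pairs,           # n(n-1)/2
    }


# Hypothetical problem size, chosen only to illustrate the ratios.
counts = slack_variable_counts(n=100, p=10)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Under these formulas the proposed algorithm uses p times fewer slack variables than DC-CD and 2p times fewer than DC-ADMM, whatever the value of n.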
