Abstract
The alternating direction method of multipliers (ADMM) is a powerful operator splitting technique for solving structured convex optimization problems. Due to its relatively low per-iteration computational cost and ability to exploit sparsity in the problem data, it is particularly suitable for large-scale optimization. However, the method may still take prohibitively long to compute solutions to very large problem instances. Although ADMM is known to be parallelizable, this feature is rarely exploited in real implementations. In this paper we exploit the parallel computing architecture of a graphics processing unit (GPU) to accelerate ADMM. We build our solver on top of OSQP, a state-of-the-art implementation of ADMM for quadratic programming. Our open-source CUDA C implementation has been tested on many large-scale problems and was shown to be up to two orders of magnitude faster than the CPU implementation.
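For reference, OSQP solves quadratic programs in the standard form below (this formulation comes from the OSQP documentation rather than from the abstract itself):

\begin{array}{ll}
\text{minimize}   & \tfrac{1}{2}\, x^\top P x + q^\top x \\
\text{subject to} & l \le A x \le u,
\end{array}

where x ∈ Rⁿ is the decision variable, P is symmetric positive semidefinite, and A encodes the constraints. Roughly speaking, each ADMM iteration for this problem reduces to one linear system solve plus a handful of elementwise vector operations, both of which map naturally onto GPU hardware.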
Highlights
Convex optimization has become a standard tool in many engineering fields including control [GPM89, RM09], signal processing [MB10], statistics [Hub64, Tib96, CWB08], finance [Mar52, CT06, BMOW14, BBD+17], and machine learning [CV95].
All numerical tests were performed on a Linux-based system with an i9-9900K processor @ 3.6 GHz (8 cores) and 64 GB of DDR4 3200 MHz RAM, equipped with an NVIDIA GeForce RTX 2080 Ti graphics processing unit (GPU) with 11 GB of VRAM.
We have explored the possibilities offered by the massive parallelism of GPUs to accelerate solutions to large-scale quadratic programs (QPs) and have managed to solve problems with hundreds of millions of nonzero entries in the problem matrices in only a few seconds.
Summary
Convex optimization has become a standard tool in many engineering fields including control [GPM89, RM09], signal processing [MB10], statistics [Hub64, Tib96, CWB08], finance [Mar52, CT06, BMOW14, BBD+17], and machine learning [CV95]. In some of these applications one seeks solutions to optimization problems whose dimensions can be very large. In the last decade operator splitting methods, such as the proximal gradient method and the alternating direction method of multipliers (ADMM), have gained increasing attention in a wide range of application areas [BPC+11, PB13, BSM+17]. These methods scale well with the problem dimensions, can exploit sparsity in the problem data efficiently, and are often parallelizable. Graphics processing units (GPUs) are hardware accelerators that offer an unmatched amount of parallel computational power for their relatively low price. They provide far greater memory bandwidths than conventional CPU-based systems, which is especially beneficial in applications that process large amounts of data. In what follows, the Euclidean projections of x ∈ Rⁿ onto the nonnegative and nonpositive orthants are denoted by x⁺ := max(x, 0) and x⁻ := min(x, 0), respectively.
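These elementwise projections are exactly the kind of operation a GPU executes efficiently, since every coordinate can be handled by an independent thread. A minimal CUDA C sketch of the projection x⁺ = max(x, 0) is shown below; it is an illustration of the one-thread-per-coordinate pattern only, not the solver's actual kernel, and all names in it are ours.

#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

/* Elementwise projection onto the nonnegative orthant:
 * y[i] = max(x[i], 0). Each GPU thread handles one coordinate,
 * so the projection parallelizes trivially across n. */
__global__ void project_nonneg(const float *x, float *y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = fmaxf(x[i], 0.0f);
    }
}

int main(void) {
    const int n = 1 << 20;               /* 2^20 coordinates */
    size_t bytes = n * sizeof(float);

    /* Host buffers with alternating-sign test data. */
    float *h_x = (float *)malloc(bytes);
    float *h_y = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) h_x[i] = (i % 2 ? 1.0f : -1.0f) * (float)i;

    /* Device buffers. */
    float *d_x, *d_y;
    cudaMalloc((void **)&d_x, bytes);
    cudaMalloc((void **)&d_y, bytes);
    cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);

    /* One thread per coordinate, 256 threads per block. */
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    project_nonneg<<<blocks, threads>>>(d_x, d_y, n);
    cudaMemcpy(h_y, d_y, bytes, cudaMemcpyDeviceToHost);

    printf("y[1] = %f (expected 1.0), y[2] = %f (expected 0.0)\n",
           h_y[1], h_y[2]);

    cudaFree(d_x); cudaFree(d_y);
    free(h_x); free(h_y);
    return 0;
}

The same pattern applies to the other vector updates in an ADMM iteration, which is one reason the method is well suited to GPUs: the per-coordinate work is independent, so the high memory bandwidth of the device can be fully exploited.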