Abstract

Linear least squares problems (LLSPs) routinely arise in scientific and engineering applications. One of the fastest ways to solve LLSPs is to perform the calculations in parallel on graphics processing units (GPUs). However, GPU algorithms are typically designed for one GPU architecture and may be suboptimal or unusable on another. Exposing tuneable parameters allows GPU algorithms to be adapted to different architectures with little need to modify code. In this paper, we investigate the benefits of using derivative-free optimization (DFO) and simulation optimization (SO) to systematically optimize the tuneable parameters of GPU and hybrid CPU/GPU LLSP solvers. Computational experiments show that both DFO and SO can be effective tools for determining tuning parameters that speed up the popular LLSP solver MAGMA by about 1.8x over its default parameters on large tall-and-skinny matrices. Using DFO solvers, we identified optimal parameters while evaluating an order of magnitude fewer parameter combinations than direct enumeration requires. Additionally, the proposed approach is faster than a state-of-the-art autotuner and provides better tuning parameters.
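To illustrate the idea of derivative-free tuning of a solver parameter, the sketch below runs a tiny pattern search over a single integer block size that controls how a blocked normal-equations LLSP solve is processed. This is only an assumed, simplified stand-in: the blocked solver, the parameter name `nb`, and the timing harness are placeholders and do not reflect MAGMA's actual API or the authors' implementation.

```python
"""Minimal sketch: derivative-free (pattern) search over one block-size
parameter of a blocked normal-equations least-squares solve.  Illustrative
only; `nb` and `lstsq_blocked` are hypothetical stand-ins for a real GPU
kernel's tuning parameters."""
import time
import numpy as np

def lstsq_blocked(A, b, nb):
    """Solve min ||Ax - b|| via normal equations, accumulating A^T A and
    A^T b over row blocks of width nb (the tunable parameter)."""
    m, n = A.shape
    G = np.zeros((n, n))
    c = np.zeros(n)
    for i in range(0, m, nb):
        Ai = A[i:i + nb]
        G += Ai.T @ Ai
        c += Ai.T @ b[i:i + nb]
    return np.linalg.solve(G, c)

def runtime(nb, A, b, reps=3):
    """Objective: best-of-`reps` wall-clock time for a given block size."""
    best = float("inf")
    for _ in range(reps):
        t0 = time.perf_counter()
        lstsq_blocked(A, b, nb)
        best = min(best, time.perf_counter() - t0)
    return best

def dfo_tune(A, b, nb0=256, max_iter=20):
    """Pattern search: repeatedly try halving/doubling nb and keep the
    fastest candidate; stop when no neighbour improves the runtime."""
    nb, t = nb0, runtime(nb0, A, b)
    for _ in range(max_iter):
        candidates = [max(8, nb // 2), nb * 2]
        times = [runtime(c, A, b) for c in candidates]
        if min(times) >= t:
            break
        i = int(np.argmin(times))
        nb, t = candidates[i], times[i]
    return nb, t

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200_000, 64))   # tall-and-skinny test matrix
    b = rng.standard_normal(200_000)
    nb, t = dfo_tune(A, b)
    print(f"best block size: {nb}, time: {t:.4f} s")
```

The key point the sketch conveys is that the objective (measured runtime) is evaluated only at the parameter values the DFO method requests, so far fewer configurations are timed than an exhaustive sweep would require.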
