Abstract

We implemented the quadruple precision Basic Linear Algebra Subprograms (BLAS) routines AXPY, GEMV, and GEMM on graphics processing units (GPUs) and evaluated their performance. We used DD-type (double-double) quadruple precision arithmetic, which combines two double precision values to represent a single quadruple precision value. On an NVIDIA Tesla C1060, our BLAS routines are up to approximately 30 times faster than an existing quadruple precision BLAS running on an Intel Core i7 920. Moreover, quadruple precision AXPY takes only about 2.7 times as long as double precision AXPY on the Tesla C1060. These results show that quadruple precision BLAS operations are well suited to GPUs.
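The DD representation mentioned above stores a quadruple precision value as an unevaluated sum of a high and a low double. The abstract does not give the authors' kernels, so the following is only a minimal illustrative sketch of DD addition using the standard error-free two-sum transformation (the function names `two_sum` and `dd_add` are our own, not from the paper):

```python
def two_sum(a: float, b: float):
    """Error-free transformation: returns (s, e) with s = fl(a+b)
    and a + b = s + e exactly (Knuth's two-sum)."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def dd_add(a_hi: float, a_lo: float, b_hi: float, b_lo: float):
    """Add two DD numbers (a_hi + a_lo) + (b_hi + b_lo).
    The result is renormalized so |lo| <= ulp(hi)/2."""
    s, e = two_sum(a_hi, b_hi)   # exact sum of the high parts
    e += a_lo + b_lo             # fold in the low parts
    hi, lo = two_sum(s, e)       # renormalize into (hi, lo)
    return hi, lo

# A tiny increment that plain double precision cannot represent
# survives in the low component of the DD result.
hi, lo = dd_add(1.0, 0.0, 1e-20, 0.0)
```

In plain double precision, `1.0 + 1e-20` rounds back to `1.0`; in the DD form the increment is retained in `lo`, which is the effect that gives DD arithmetic its roughly 32 significant decimal digits.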
