Abstract
We present a high-performance C++ library for high but fixed precision (128 to 512 bit) integer arithmetic and symbolic polynomial computations. While the large integer and polynomial computation parts of the library can be used independently optimized kernels for symbolic polynomials with large integer coefficients are provided. The kernels were manually optimized in assembly language for the x86-64 and power64 architectures. Our main target application is high-temperature series expansions which requires inner products of large vectors of polynomials with large integer coefficients. For this purpose we implemented a tunable hybrid CPU/GPU inner product function using OpenMP and NVIDIA CUDA with inline PTX assembly. This way we make optimal use of today’s and upcoming hybrid supercomputers and attain 49% of the peak performance of the current NVIDIA Kepler GPU. Compared to a pure CPU solution using the GNU Multiple Precision Arithmetic Library (GMP) we gain a speedup of 13x for a pure CPU inner product and 38x using a GPU accelerator.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.