Abstract

The efficiency of matrix–vector multiplication is of considerable importance, yet no current approach optimizes it sufficiently well under severe time constraints. All major existing methods rely on either manual tuning or auto-tuning and can therefore be time-consuming. We introduce an alternative, model-driven approach that maps the implementation of matrix–vector multiplication to a target architecture and derives its parameters analytically. The approach yields performance competitive with optimized Basic Linear Algebra Subprograms (BLAS)-like dense linear algebra libraries without the need for manual tuning or auto-tuning. Our method provides competitive performance across hardware architectures and can be used to obtain both single-threaded and multi-threaded implementations on multicore processors. We expect this approach to help the community progress from valuable engineering solutions to techniques with broader application.
