Abstract

This paper describes how media processing programs may be accelerated by using the multimedia instruction extensions that have been added to general-purpose microprocessors. As a concrete example, it describes MAX2, a minimalist, second- generation set of multimedia instructions included in the PA-RISC 2.0 processor architecture. MAX2 implements subword parallel instructions, which utilize the microprocessor's 64-bit wide data paths to process multiple pieces of lower- precision data in parallel. It also includes innovative, new instructions like Mix, which are very useful for matrix transpose and other common data rearrangements. The paper examines some typical multimedia kernels, like block match, matrix transpose, box filter and the IDCT, coded with and without the MAX2 instructions, to illustrate programming techniques for exploiting subword parallelism and superscalar instruction parallelism. The kernels using MAX2 show significant speedups in execution time, and more efficient utilization of the processor's resources.© (1997) COPYRIGHT SPIE--The International Society for Optical Engineering. Downloading of the abstract is permitted for personal use only.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call