Abstract

Modern workstations are equipped with fast cache memory so that the CPU can access the relatively slow main memory without noticeable delay. However, two typical cache characteristics (limited associativity and a power-of-two-based mapping of memory addresses onto cache lines) cause the entire class of separable image processing algorithms to exhibit the worst possible data cache utilization on large images. We present three methods, all based on transposing the image, that improve data cache usage for both write-through and write-back caches. Experiments with a 3 × 3 uniform filter and the fast Fourier transform, performed on a range of Sun workstations, show that the proposed methods improve performance considerably.
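
The address-mapping effect described above can be illustrated with a minimal sketch. It is not taken from the paper: the cache geometry (32-byte lines, 512 sets, direct-mapped), the 1-byte pixels, the row-major layout, and the function names are all assumptions chosen for illustration, and they do not describe any particular Sun workstation.

```python
# Hypothetical cache geometry, for illustration only.
LINE_SIZE = 32    # bytes per cache line (assumed)
NUM_SETS = 512    # number of sets; direct-mapped, so 16 KiB in total (assumed)


def cache_set(address):
    """Set index of a byte address under power-of-two address mapping."""
    return (address // LINE_SIZE) % NUM_SETS


def sets_touched_by_column(width, height, pixel_size=1):
    """Distinct cache sets used when walking down column 0 of a row-major image."""
    return len({cache_set(y * width * pixel_size) for y in range(height)})


def sets_touched_by_row(width, pixel_size=1):
    """Distinct cache sets used when walking along row 0 of a row-major image."""
    return len({cache_set(x * pixel_size) for x in range(width)})


if __name__ == "__main__":
    # Column pass of a separable filter on a 1024-pixel-wide image.
    print("column, width 1024:", sets_touched_by_column(1024, 1024))
    # Same pass on a width that is not a power of two.
    print("column, width 1000:", sets_touched_by_column(1000, 1024))
    # After transposing, the former column is a contiguous row.
    print("row, width 1024:   ", sets_touched_by_row(1024))
```

Under these assumptions, the 1024 pixels of one column in a 1024-pixel-wide image lie on 1024 different cache lines that compete for only 16 sets, so lines are evicted before their neighbouring pixels are used. After a transpose, the same pixels are contiguous and occupy 32 fully used lines in 32 different sets, which is the kind of gain the transposition-based methods aim for.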
