Abstract

In this paper we derive an efficient realization of the edge detection algorithm on a target architecture with two levels of parallelism. The target architecture is a processor array in which parallelism is achieved 1) within each processing element by sub-word parallelism (SWP) and 2) across the array by arranging several processing elements. We exploit both levels by a parameterized two-level partitioning of the algorithm. To obtain a significant speed-up, we select partitioning parameters that match the target architecture and require a minimum number of additional instructions for SWP. At first, this partitioning appears to require large-scale communication within the processor array. A detailed analysis, performed automatically by integer linear programming, lets us identify and eliminate redundant communication. Hence, our realization of the edge detection algorithm is efficient in terms of the energy consumed by communication within the processor array, and it obtains a significant speed-up by exploiting both levels of parallelism.
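The two-level partitioning described above can be illustrated with a minimal Python sketch. It is not the paper's realization: the edge operator (sum of absolute forward differences), the function names, and the parameters `pe_rows`, `pe_cols`, and `lanes` are all assumptions chosen for illustration. The outer level assigns one image tile to each processing element, the inner level groups `lanes` neighboring pixels as they would be packed into one sub-word operation, and a counter records how many operands a tile would have to fetch from a neighboring tile.

```python
def edge_reference(img):
    """Sequential edge detector: |dx| + |dy| with forward differences (assumed operator)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            out[y][x] = (abs(img[y][x + 1] - img[y][x])
                         + abs(img[y + 1][x] - img[y][x]))
    return out


def edge_partitioned(img, pe_rows, pe_cols, lanes):
    """Same operator, iterated in a two-level partitioned order.

    Level 2: the image is split into pe_rows x pe_cols tiles, one per
             (hypothetical) processing element.
    Level 1: inside a tile, each row is processed in groups of `lanes`
             pixels, modelling one sub-word-parallel operation.
    Returns the edge image and the number of operands read from a
    neighboring tile (a rough proxy for inter-PE communication).
    """
    h, w = len(img), len(img[0])
    tile_h = -(-h // pe_rows)  # ceiling division
    tile_w = -(-w // pe_cols)
    out = [[0] * w for _ in range(h)]
    halo_reads = 0
    for pr in range(pe_rows):
        for pc in range(pe_cols):
            y0, y1 = pr * tile_h, min((pr + 1) * tile_h, h)
            x0, x1 = pc * tile_w, min((pc + 1) * tile_w, w)
            for y in range(y0, y1):
                for xb in range(x0, x1, lanes):
                    # One modelled SWP operation covers up to `lanes` pixels.
                    for x in range(xb, min(xb + lanes, x1)):
                        if y + 1 >= h or x + 1 >= w:
                            continue  # image border: no output, as in the reference
                        if x + 1 >= x1:
                            halo_reads += 1  # right neighbor lives in another tile
                        if y + 1 >= y1:
                            halo_reads += 1  # lower neighbor lives in another tile
                        out[y][x] = (abs(img[y][x + 1] - img[y][x])
                                     + abs(img[y + 1][x] - img[y][x]))
    return out, halo_reads


if __name__ == "__main__":
    image = [[(3 * x + 5 * y) % 7 for x in range(8)] for y in range(8)]
    result, reads = edge_partitioned(image, pe_rows=2, pe_cols=2, lanes=4)
    print(result == edge_reference(image), reads)
```

The sketch only shows that the partitioned iteration order leaves the result unchanged while making cross-tile reads explicit. Choosing the partitioning parameters and removing redundant communication, which the paper handles with integer linear programming, is not modelled here.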

