Abstract

A new design for parallel and distributed processing elements (PEs) is proposed to configure Beneš networks based on a novel parallel algorithm that can realise full and partial permutations in a unified manner with very little overhead time and extra hardware. The proposed design reduces the hardware complexity of PEs from O ( N 2 ) to O ( N ( lo g 2 N ) 2 ) due to a distributed architecture. In the proposed design, asynchronous operation was introduced in parts to reduce the time complexity per PE stage down to O(1) within a certain N, while it takes O ( lo g 2 N ) time per PE stage in conventional algorithms. A prototype parallel was constructed and PEs were distributed in a field programmable gate array to investigate performance for the switch size of N = 4 to 32. The experimental results demonstrate that the proposed design outperforms a recent method by at least several times in terms of hardware and processing time complexities.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call