Abstract

Detailed algorithms for all‐to‐all broadcast and reduction are given for arrays mapped by binary or binary‐reflected Gray code encoding to the processing nodes of binary cube networks. Algorithms are also given for the local computation of the array indices for the communicated data, thereby reducing the demand for the communications bandwidth. For the Connection Machine system CM‐200, Hamiltonian cycle‐based all‐to‐all communication algorithms yield a performance that is a factor of 2 to 10 higher than the performance offered by algorithms based on trees, butterfly networks, or the Connection Machine router. The peak data rate achieved for all‐to‐all broadcast on a 2,048‐node Connection Machine system CM‐200 is 5.4 Gbyte/s. The index order of the data in local memory depends on implementation details of the algorithms, but it is well defined. If a linear ordering is desired, then including the time for local data reordering reduces the effective peak data rate to 2.5 Gbyte/s.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.