Abstract
The time spent in communication operations is a major factor in determining the scalability of parallel applications. Tuning the parameters of a communication library can adapt its characteristics to a particular platform and thereby minimize the communication time of an application. The goal of this paper is to improve the theoretical and practical understanding of how performance improvements of point-to-point operations propagate to collective communication operations. Using Hockney's model and the LogGP model, we derive formulas for the expected improvement of a collective operation based on the improvement observed for point-to-point communication. Our results indicate that many collective algorithms will inherently see a smaller performance improvement than the one observed for point-to-point operations. For most test cases, our evaluation shows a good match between the predictions made by our models and the observed data, but it also identifies multiple reasons for potential disparity between theory and practice.