Abstract

Process management libraries and runtime environments play an important role in the HPC application lifecycle. This work provides a roadmap for implementing high-performance PMIx-based software stacks. It targets four performance-critical areas and presents novel co-designed solutions that significantly improve application performance during initialization and wire-up at scale. First, new locking and thread-safety schemes for PMIx on-host communication are designed, demonstrating up to a 66x reduction in PMIx_Get latency. Second, optimizations of the protocols involved in the wire-up procedure are proposed. Specific improvements in the UCX endpoint address representation, the layout of PMIx metadata, and the use of Little-Endian Base 128 (LEB128) encoding decrease the volume of inter-node data exchanged by up to 8.6x. Third, a modification of the Bruck concatenation algorithm is presented that scales better than the ring- and tree-based implementations currently used in resource managers for PMIx data exchange. Lastly, an out-of-band channel leveraging the high-performance fabric is evaluated, demonstrating orders-of-magnitude performance improvement over the existing implementation.
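
To illustrate why Little-Endian Base 128 encoding can shrink exchanged metadata, the following is a minimal sketch of generic unsigned LEB128 encoding and decoding. It is not taken from the paper or from the PMIx code base; the function names `leb128_encode` and `leb128_decode` are hypothetical, and which PMIx fields are actually encoded this way is not stated in the abstract.

```python
def leb128_encode(value):
    """Encode a non-negative integer as unsigned LEB128 bytes.

    Each output byte carries 7 payload bits, least significant group first.
    The high bit is set on every byte except the last one.
    """
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F      # low 7 bits of the remaining value
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)         # final group, high bit clear
            return bytes(out)


def leb128_decode(data, offset=0):
    """Decode one unsigned LEB128 integer starting at `offset`.

    Returns a (value, next_offset) pair so several integers can be
    read back to back from the same buffer.
    """
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, offset
        shift += 7


# Illustrative values only: small integers (such as ranks or lengths)
# take one byte instead of a fixed 4 or 8 bytes.
sizes = {v: len(leb128_encode(v)) for v in (5, 127, 128, 624485)}
```

Under this scheme, values below 128 occupy a single byte and values below 16384 occupy two, which is where the savings over fixed-width integers would come from when most encoded values are small.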
