Abstract
Geo-distributed machine learning (Geo-DML) adopts a hierarchical training architecture that combines local model synchronization within each data center with global model synchronization (GMS) across data centers. However, scarce and heterogeneous wide area network (WAN) bandwidth can become the bottleneck of training performance. An intelligent optical device, the reconfigurable optical add-drop multiplexer, makes the modern WAN topology reconfigurable, a capability that most approaches to speeding up Geo-DML training have ignored. In this paper, we therefore study scheduling algorithms that accelerate model synchronization for Geo-DML training while taking the reconfigurable optical WAN topology into account. Specifically, we use an aggregation tree for each Geo-DML training job, which helps reduce the communication overhead of model synchronization across the WAN, and we propose two efficient algorithms to accelerate GMS for Geo-DML: MOptree, a model-based algorithm for single-job scheduling, and MMOptree, for multiple-job scheduling. Both aim to reconfigure the WAN topology and the trees by reassigning wavelengths on each fiber. Based on the current WAN topology and job information, mathematical models are built to guide the topology reconstruction and the wavelength and bandwidth allocation for each edge of the trees. Simulation results show that MOptree completes the GMS stage up to 56.16% faster on average than the traditional tree without optical-layer reconfiguration, and that MMOptree achieves up to 54.6% less weighted GMS time.