Abstract

Recently, parameter servers have shown great performance advantages in training large-scale deep learning (DL) systems. However, most existing systems implicitly assume that the environment is homogeneous and therefore adopt the stale synchronous parallel (SSP) mechanism for computation. This assumption does not hold in many real-world settings. Although continuous efforts have been made to accommodate heterogeneity, they focus merely on adjusting the learning rate according to easily accessible iteration staleness. Unfortunately, since parameters vary dramatically across both servers and workers, a learning rate determined by iteration-staleness-based analysis slows down computation and wastes the effectiveness of iterations. We reveal that analyzing parameter differences with value staleness, a novel abstraction we propose, better reflects the effectiveness of each iteration. The inspiration for value staleness comes from our comparison between workers and servers, an aspect often neglected by previous work. These observations motivate us to propose a value-staleness-aware stochastic gradient descent (VSGD) scheme, which uses value staleness to assign a dynamic learning rate to each iteration in parameter servers. Evaluation on various benchmarks demonstrates that VSGD significantly outperforms previous schemes: for instance, it accelerates computation to 1.26x and reduces the error rate to 75% of that of SSP.
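
To make the core idea concrete, the following is a minimal sketch of a value-staleness-based learning-rate adjustment. The staleness measure (parameter-difference norm), the damping rule `base_lr / (1 + s)`, and all function names are illustrative assumptions, not the paper's actual VSGD formulation.

```python
import numpy as np

def value_staleness(theta_server, theta_worker):
    """Hypothetical value staleness: the norm of the gap between the server's
    current parameters and the (possibly stale) copy a worker used to compute
    its gradient. The paper's exact definition may differ."""
    return np.linalg.norm(theta_server - theta_worker)

def vsgd_step(theta_server, theta_worker, grad, base_lr):
    """One illustrative server-side update: damp the learning rate as value
    staleness grows, so gradients computed on badly outdated parameters
    contribute less. This scaling rule is assumed for illustration only."""
    s = value_staleness(theta_server, theta_worker)
    lr = base_lr / (1.0 + s)  # assumed damping; not the paper's formula
    return theta_server - lr * grad

# Toy usage: two workers push the same gradient, computed on snapshots of
# different freshness; the staler snapshot receives a smaller effective step.
theta = np.zeros(4)
snapshot_fresh = theta.copy()
snapshot_stale = theta + np.array([0.5, -0.5, 0.5, -0.5])
grad = np.ones(4)
theta = vsgd_step(theta, snapshot_fresh, grad, base_lr=0.1)  # near-full step
theta = vsgd_step(theta, snapshot_stale, grad, base_lr=0.1)  # damped step
```

The contrast with iteration staleness is that the damping here depends on how far the parameter values have drifted, not merely on how many iterations have elapsed since the worker's snapshot.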
