Distributed machine learning based link allocation strategy

Yi Yang, Tenghui Ke, Mingkang Song, Xiayan Zheng, Xijin Li, Zhengguang Wu, Weidong Li, Peng Dai, Jianming Zhou

doi:10.1109/icss55994.2022.00044

Abstract

Machine learning commonly relies on systems with multiple nodes. Each node runs the distributed training process on the portion of the data allocated to it and provides the resulting training output to a server, so the machine learning data must be transmitted over the network. This paper proposes a link allocation method for distributed machine learning. When computing nodes are distributed across domains, inconsistencies in link distance, node performance, and link load leave the traffic distribution between computing nodes unbalanced. To address the complex computing requirements of distributed machine learning, a link pre-allocation method is proposed that establishes a central server-link-node topology map, integrates link resources, and determines the logical distance of nodes. For a synchronously distributed machine learning training set, transmission link resources are preallocated and transmission is initiated according to the remaining storage capacity of the nodes. The aim is to improve network utilization efficiency during machine learning and to overcome the effect of large network transmission delay on the efficiency of distributed machine learning.
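The abstract gives no formulas, so the following is only a minimal sketch of how a pre-allocation step of this kind might look, not the authors' method. Everything in it is an assumption made for illustration: the `Link` and `Node` classes, the use of estimated transfer time as the "logical distance", and the rule of serving nodes with the most remaining storage first while deferring nodes that lack room for the payload.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """A server-to-node link in the topology map (hypothetical model)."""
    name: str
    node: str
    delay_ms: float        # propagation delay, standing in for link distance
    capacity_mbps: float   # nominal bandwidth
    load: float            # fraction of capacity already in use, 0.0 to 1.0

    def available_mbps(self) -> float:
        return self.capacity_mbps * (1.0 - self.load)


@dataclass
class Node:
    """A computing node (hypothetical model)."""
    name: str
    free_storage_mb: float  # remaining storage capacity
    payload_mb: float       # data the node must exchange this round


def logical_distance(link: Link, payload_mb: float) -> float:
    """Assumed metric: estimated transfer time in ms (delay plus serialization)."""
    bandwidth = link.available_mbps()
    if bandwidth <= 0:
        return float("inf")
    # megabytes -> megabits, divided by Mbps gives seconds, then to ms
    return link.delay_ms + payload_mb * 8.0 / bandwidth * 1000.0


def preallocate(nodes, links):
    """Reserve one link per node, visiting nodes with the most free storage first."""
    by_node = {}
    for link in links:
        by_node.setdefault(link.node, []).append(link)

    plan = []
    for node in sorted(nodes, key=lambda n: n.free_storage_mb, reverse=True):
        candidates = by_node.get(node.name, [])
        if not candidates or node.free_storage_mb < node.payload_mb:
            continue  # no link, or not enough room: defer this node
        best = min(candidates, key=lambda l: logical_distance(l, node.payload_mb))
        if logical_distance(best, node.payload_mb) == float("inf"):
            continue  # every candidate link is saturated
        plan.append((node.name, best.name))
    return plan
```

Calling `preallocate(nodes, links)` returns a list of `(node name, link name)` pairs in the order transmissions would be initiated. A real system would also need to track shared link capacity across nodes, which this sketch leaves out.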

Similar Papers

A Survey of Scaling Distributed System Via Machine Learning and An Insight on Hadoop and Spark
Atheel Sabih Shaker
IOP Conference Series: Materials Science and Engineering | VOL. 928
01 Nov 2020

Distributed Machine Learning with a Serverless Architecture
Hao Wang ... Di Niu
01 Apr 2019

Traffic Management for Distributed Machine Learning in RDMA-enabled Data Center Networks
Weihong Yang ... Yang Qin
01 Jun 2021

Distributed Graph Computation Meets Machine Learning
Wencong Xiao ... Zhen Li
IEEE Transactions on Parallel and Distributed Systems | VOL. 31
20 Apr 2020
