Learning a good similarity measure for large-scale high-dimensional data is a crucial task in machine learning applications, yet it poses a significant challenge. Distributed minibatch stochastic gradient descent (SGD) is an efficient optimization method for large-scale distributed training, offering linear speedup in the number of workers. However, communication efficiency in distributed SGD requires a sufficiently large minibatch size, which presents two distinct challenges. First, a large minibatch size leads to high memory usage and computational complexity during parallel training of high-dimensional models. Second, a larger batch size degrades the convergence rate. To overcome these challenges, we propose an efficient distributed sparse relative similarity learning framework, EDSRSL. This framework integrates two strategies: local minibatch SGD and sparse relative similarity learning. By reducing the number of global updates through synchronous delay while maintaining a large batch size, we address the issue of high computational cost. Additionally, we incorporate sparse model learning into the training process, further reducing the cost of training high-dimensional models. We also provide a theoretical proof that the convergence rate does not degrade significantly as the batch size increases. Experiments on six high-dimensional real-world datasets demonstrate the efficacy and efficiency of the proposed algorithms, with a communication cost reduction of up to \(90.89\%\) and a maximum wall-time speedup of \(5.66\times\) compared to the baseline methods.
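To make the local minibatch SGD strategy concrete, the sketch below shows the general pattern the abstract alludes to: each worker performs several local minibatch updates (the synchronous delay) with a sparsifying proximal step before the models are averaged, so communication happens once per round rather than once per update. This is a minimal illustration under assumed names (`grad_fn`, `soft_threshold`, `local_minibatch_sgd`) and is not the authors' EDSRSL algorithm or its exact update rule.

```python
import numpy as np

def soft_threshold(w, lam):
    """Elementwise soft-thresholding: a common proximal step for l1-type sparsity."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def local_minibatch_sgd(grad_fn, w0, num_workers=4, rounds=20,
                        local_steps=5, batch_size=256, lr=0.1, lam=1e-3,
                        rng=None):
    """Generic local minibatch SGD with periodic averaging and a sparsity prox step.

    grad_fn(w, batch_size, rng) must return a stochastic gradient estimate
    computed on a minibatch of the given size. Each worker runs `local_steps`
    updates between synchronizations, so only one communication (model average)
    is needed per round instead of one per gradient step.
    """
    rng = rng or np.random.default_rng(0)
    w_global = w0.copy()
    for _ in range(rounds):
        local_models = []
        for _ in range(num_workers):          # sequential stand-in for parallel workers
            w = w_global.copy()
            for _ in range(local_steps):      # synchronous delay: several local updates
                g = grad_fn(w, batch_size, rng)
                w = soft_threshold(w - lr * g, lr * lam)  # gradient step + sparsity prox
            local_models.append(w)
        # Synchronization point: average the local models (one communication per round).
        w_global = np.mean(local_models, axis=0)
    return w_global
```

In this pattern, the communication volume scales with the number of rounds rather than the total number of gradient steps, and the proximal step keeps the high-dimensional model sparse throughout training, which is the intuition behind the reported communication and wall-time savings.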