This issue contains two Survey and Review papers. The first, by Qinmeng Zou and Frédéric Magoulès, is “Delayed Gradient Methods for Symmetric and Positive Definite Linear Systems.” Gradient methods are the oldest and simplest algorithms for minimizing a real objective function $f(x)$. The $(n+1)$st approximation to the minimizer is defined as $x_{n+1} = x_n - \alpha_n g_n$, where $g_n$ is the gradient $\nabla f(x_n)$ and $\alpha_n > 0$ is a steplength that depends on the specific method being used. Old and simple as the idea may be, gradient algorithms are of much current interest in the literature; for instance, they played a major role in the influential survey devoted to optimization in machine learning published in Volume 60, Issue 2 of this journal. The paper by Zou and Magoulès focuses on the quadratic case $f(x) = (1/2)x^TAx - b^Tx$ ($A$ symmetric and positive definite), where finding the minimizer is of course equivalent to solving the linear system $Ax = b$ and the gradient $g_n$ coincides with the residual $Ax_n - b$. Well-known strategies for determining the steplength include steepest descent, where $\alpha_n$ is chosen so as to minimize $f(x_{n+1})$, and minimal gradient (or minimal residual), where one rather minimizes the length of $g_{n+1}$. In both of these strategies, the value of $\alpha_n$ depends only on $g_n$. The term “delayed” in the title of the article refers to methods where the recipe for determining $\alpha_n$ includes information from past gradients $g_{n-1}, g_{n-2}, \dots$ and/or past steplengths $\alpha_{n-1}, \alpha_{n-2}, \dots$. The numerical experiments reported clearly indicate that such delayed strategies may give rise to algorithms that are competitive with conjugate gradient methods on large, ill-conditioned problems. The paper presents a neat summary of the recent results in this area and of the techniques used to derive them.

Mark Van der Boor, Sem C. Borst, Johan S. H.
Van Leeuwaarden, and Debankur Mukherjee are the authors of the second paper, “Scalable Load Balancing in Networked Systems: A Survey of Recent Advances.” The problem under consideration is as follows. A dispatcher receives clients that arrive randomly, and her job is to direct them to one of $N \gg 1$ servers. The time required to serve each client is also random, so that a queue (of random length) of waiting clients forms at each server. How should the dispatcher proceed to expedite the service? As one would expect in these days of cloud networks and data systems with massive numbers of individual centers, the problem is currently receiving much attention in the literature. A strategy that suggests itself is the so-called “join the shortest queue” (JSQ), where, on arrival, clients are directed to the server with the shortest queue. While JSQ has been proved to possess several favorable properties, it may not be the best option because of communication overhead: each time a client arrives, the dispatcher has to communicate with all $N$ servers to find the lengths of their queues. The paper analyzes many alternative strategies in the limit $N \rightarrow \infty$. Nonspecialists will have little difficulty in reading the easily accessible first few sections and may be interested in discovering how even small tweaks to the algorithms may result in substantial improvements in their performance.
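To make the “delayed” steplength idea concrete, here is a minimal sketch (Python with NumPy) of one classical delayed rule, the Barzilai–Borwein step, which computes $\alpha_n$ from the previous iterate and gradient rather than from $g_n$ alone. The function name, parameters, and defaults are illustrative assumptions, not taken from the survey.

```python
import numpy as np

def bb_gradient(A, b, x0, tol=1e-10, max_iter=500):
    """Delayed (Barzilai-Borwein) gradient method for Ax = b, A SPD.

    The steplength for iteration n uses the differences of the two
    most recent iterates and gradients -- "delayed" information --
    instead of depending on the current gradient alone.
    """
    x = np.asarray(x0, dtype=float).copy()
    g = A @ x - b                        # gradient = residual Ax - b
    if np.linalg.norm(g) < tol:
        return x
    alpha = (g @ g) / (g @ (A @ g))      # first step: classical steepest descent
    for _ in range(max_iter):
        x_new = x - alpha * g
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g      # iterate and gradient differences
        x, g = x_new, g_new
        if np.linalg.norm(g) < tol:
            break
        alpha = (s @ s) / (s @ y)        # BB step: built from past information
    return x
```

Because $A$ is symmetric positive definite, $s^Ty = s^TAs > 0$, so the step is always well defined; convergence is typically nonmonotone but often much faster than steepest descent.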
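The communication trade-off is easy to see in a toy experiment with the “power of $d$ choices” variant JSQ($d$), where the dispatcher samples only $d$ queues at random and sends the client to the shortest of them ($d = 1$ is purely random routing; $d = N$ recovers full JSQ). The discrete-time model, rates, and function name below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def simulate_jsq_d(N=30, d=2, lam=0.75, mu=0.8, steps=4000, seed=0):
    """Toy discrete-time simulation of JSQ(d) load balancing.

    Each slot, a Poisson(lam * N) batch of clients arrives; every client
    probes d randomly sampled queues and joins the shortest of them.
    Each busy server then completes one job with probability mu.
    Returns the mean queue length, averaged over the second half of the run.
    """
    rng = np.random.default_rng(seed)
    queues = np.zeros(N, dtype=int)
    total = 0.0
    for t in range(steps):
        for _ in range(rng.poisson(lam * N)):             # arrivals this slot
            probe = rng.choice(N, size=d, replace=False)  # sample d queues
            queues[probe[np.argmin(queues[probe])]] += 1  # join shortest sampled
        busy = queues > 0
        queues[busy & (rng.random(N) < mu)] -= 1          # service completions
        if t >= steps // 2:                               # discard warm-up half
            total += queues.mean()
    return total / (steps - steps // 2)
```

At high load, already $d = 2$ gives a dramatic reduction in mean queue length compared with random routing, while probing only two servers per arrival instead of all $N$.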