The first Survey and Review article in this issue is “Decay Properties of Spectral Projectors with Applications to Electronic Structure,” by Michele Benzi, Paolo Boito, and Nader Razouk. The linear scaling methods that motivate this work use a reformulation of the conventional electronic structure calculation. Instead of solving a sequence of linear eigenvalue problems, one computes a sequence of spectral projectors, known as density matrices, from which physical quantities of interest follow directly. Large-scale computations then become feasible, provided that the entries of the density matrices are localized; that is, they decay rapidly away from the diagonal.

The authors first introduce the basic principles of electronic structure theory and then survey current computational approaches, most of which arose in the physical sciences, along with their underlying localization requirements. The overview highlights a distinction, often blurred outside the mathematics literature, between physical insight and intuition on the one hand and mathematical rigor on the other. The key contribution is then to formulate the localization problem mathematically, with transparent assumptions, and to develop a unified analytical approach. Tools from matrix analysis and approximation theory are used to justify, where feasible, the localization “results” on which today's algorithms rely.

The theory is framed in an asymptotic regime where the system volume increases but the density of particles remains fixed. This type of thermodynamic limit will be familiar to readers who work on multiscale models in physics and chemistry, but it may seem strange to numerical analysts who are used to “convergence” in the sense of a fixed problem and an arbitrarily fine mesh. The upshot is a sequence of matrices of increasing dimension that look very different from those arising in the discretization of PDEs. After pushing the theory as far as possible, the authors comment on the practical implications of their bounds and raise a number of open questions. They also point out that the decay results may find useful application in other, unrelated areas: quantum information theory, complex networks, and eigenvalue solvers for tridiagonal matrices. This article illustrates how applied analysis can justify some of the leaps of faith made in the physical sciences and can also resolve controversies (section 8.7). It will be of particular interest to readers who work in matrix computation and approximation theory.

The second article, “Conditional Gradient Algorithms for Rank-One Matrix Approximations with a Sparsity Constraint,” by Ronny Luss and Marc Teboulle, shares with the first the theme of algorithms built on large-scale matrix computations whose performance in practice outstrips our current analytical understanding. Most applied mathematicians know that, given a symmetric matrix $A$, dimension reduction can be achieved by using the dominant eigenvectors (singular vectors) of $A$ to form optimal low-rank approximations. In the statistics literature, modulo a centering operation, the same idea is called principal component analysis (PCA). This type of least-squares approach tries to recover all rows and columns of $A$; equivalently, if we think of $A$ as representing pairwise interactions between the nodes of a network, it aims to summarize the full set of interactions.
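To make the least-squares picture concrete, here is a minimal numpy sketch (our illustration, not code from either article): for a symmetric matrix, the eigenpair of largest magnitude yields the best rank-one approximation in the Frobenius norm.

```python
import numpy as np

# Illustrative only: rank-one approximation of a symmetric matrix via its
# dominant eigenpair, the (uncentered) PCA idea described above.
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
A = (B + B.T) / 2                        # symmetric test matrix

w, V = np.linalg.eigh(A)                 # eigenvalues in ascending order
i = np.argmax(np.abs(w))                 # index of the dominant eigenpair
A1 = w[i] * np.outer(V[:, i], V[:, i])   # best rank-one approximation (Frobenius norm)

print(f"relative error: {np.linalg.norm(A - A1, 'fro') / np.linalg.norm(A, 'fro'):.3f}")
```

Note that the dominant eigenvector computed this way is generically dense; that is precisely what motivates the sparsity constraint discussed next.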
It is, of course, possible to impose a postprocessing threshold to cut off the less important contributions, but it is also reasonable to shoot directly for the important rows and columns, or the important network nodes. This alternative leads to the class of problems discussed here: matrix optimization problems with a prescribed upper bound on the number of nonzeros in the required “eigenvectors.” The authors discuss a variety of computational approaches that have been proposed in the literature and introduce a unifying framework, which they call ConGradU, to characterize and analyze them. The basic iterative scheme, summarized in Algorithm 1, has an optimization substep that admits a closed-form solution in a number of important cases; one such case is sketched below.

The authors also include an informative computational example based on the textual content of State of the Union addresses from 1790--2011. For example, Table 6.2 shows the key word stems associated with the three principal factors for both thresholded PCA and the more direct sparsity-seeking alternative, and it is clear that the sparse version summarizes different information.

This article will appeal to readers who are keen to keep up with developments in modern optimization and to learn about techniques that fall under the current “Big Data” banner.
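As a hedged illustration of the substep just mentioned, the following sketch treats the $\ell_0$-constrained case, where, as we understand the framework, linearizing the quadratic objective reduces the substep to hard thresholding followed by normalization, recovering a truncated power iteration. The function name `sparse_rank_one`, the sparsity parameter `k`, and the toy data are our own choices, not the authors'.

```python
import numpy as np

def sparse_rank_one(A, k, iters=200, seed=0):
    """Schematic ConGradU-style iteration (our sketch, not the authors' code):
    approximately maximize x^T A x over unit vectors with at most k nonzeros.
    Each step linearizes the objective at the current x and solves the substep
    in closed form by hard-thresholding A @ x to its k largest-magnitude entries."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        g = A @ x                          # gradient direction (up to a factor of 2)
        g[np.argsort(np.abs(g))[:-k]] = 0  # keep only the k largest-magnitude entries
        x = g / np.linalg.norm(g)          # renormalize to the unit sphere
    return x

# Toy usage: a symmetric matrix with a planted 5-sparse principal direction.
rng = np.random.default_rng(1)
v = np.zeros(30)
v[:5] = 1 / np.sqrt(5)
N = 0.1 * rng.standard_normal((30, 30))
A = 10 * np.outer(v, v) + (N + N.T) / 2
x = sparse_rank_one(A, k=5)
print("support found:", np.nonzero(x)[0])  # typically recovers indices 0..4
```

With the constraint removed (k equal to the dimension), the scheme collapses to the classical power method, which makes the connection with ordinary PCA transparent.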