Many recent successes of machine learning went hand in hand with advances in optimization. The exchange of ideas between the two fields has worked both ways: machine learning builds on standard optimization procedures such as gradient descent, while new directions in optimization theory stem from machine learning applications. In this thesis, we discuss new developments in optimization inspired by the needs and practice of machine learning, federated learning, and data science. In particular, we consider seven key challenges of mathematical optimization and develop a solution to each.

Our first contribution is the resolution of a key open problem in Federated Learning: we establish the first theoretical guarantees for the popular Local SGD algorithm in the heterogeneous data regime. As the second challenge, we close the gap between the upper and lower bounds in the theory of two algorithms, Random Reshuffling (RR) and Shuffle-Once (SO), which are widely used in practice and serve as the default data-selection strategies for SGD in modern machine learning software. Our third contribution combines our new theory for proximal RR with Local SGD, yielding a new algorithm that we call FedRR. Unlike Local SGD, FedRR can provably beat gradient descent in communication complexity in the heterogeneous data regime. The fourth challenge concerns the class of adaptive methods: we present the first parameter-free stepsize rule for gradient descent that provably works for any locally smooth convex objective. The fifth challenge is the development of an algorithm for distributed optimization with quantized updates that preserves the linear convergence of gradient descent. Finally, in the sixth and seventh challenges, we develop new variance-reduction (VR) mechanisms, based on proximal operators, that are applicable to the non-smooth setting.
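To make the Local SGD setting concrete, the following is a minimal Python sketch of the algorithm's structure: each client runs several local SGD steps on its own (possibly heterogeneous) data, after which the server averages the local iterates. The quadratic per-point losses, the synthetic client data, and all parameter values here are illustrative assumptions, not the thesis's experiments.

```python
import numpy as np

def local_sgd(client_data, x0, stepsize=0.01, local_steps=10, rounds=50):
    """Minimal Local SGD sketch: clients take `local_steps` SGD steps
    on their own data, then the server averages the local iterates."""
    x = x0.copy()
    for _ in range(rounds):
        local_iterates = []
        for A, b in client_data:  # each client holds its own (A, b)
            y = x.copy()
            for _ in range(local_steps):
                i = np.random.randint(len(b))     # sample one local data point
                grad = (A[i] @ y - b[i]) * A[i]   # grad of 0.5 * (a_i^T y - b_i)^2
                y -= stepsize * grad
            local_iterates.append(y)
        x = np.mean(local_iterates, axis=0)       # communication round: average
    return x

# Illustrative heterogeneous clients: each has its own least-squares problem.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(4)]
x_hat = local_sgd(clients, x0=np.zeros(5))
```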
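The difference between Random Reshuffling and Shuffle-Once lies only in how the data order is drawn, as the sketch below shows: RR draws a fresh permutation at every epoch, while SO permutes once and reuses that order. The quadratic loss is again an illustrative assumption.

```python
import numpy as np

def shuffled_sgd(A, b, x0, stepsize=0.01, epochs=30, reshuffle=True):
    """SGD with without-replacement sampling.
    reshuffle=True  -> Random Reshuffling (new permutation every epoch)
    reshuffle=False -> Shuffle-Once (one permutation, reused)"""
    n = len(b)
    x = x0.copy()
    perm = np.random.permutation(n)           # Shuffle-Once permutation
    for _ in range(epochs):
        if reshuffle:
            perm = np.random.permutation(n)   # fresh permutation (RR)
        for i in perm:
            grad = (A[i] @ x - b[i]) * A[i]   # grad of 0.5 * (a_i^T x - b_i)^2
            x -= stepsize * grad
    return x
```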
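For the adaptive-methods contribution, below is a minimal sketch of a parameter-free stepsize rule of this flavor: the stepsize is built from a local smoothness estimate computed from consecutive iterates and gradients, with no tuning required. The sketch follows the adaptive gradient descent rule of Malitsky and Mishchenko; whether this is exactly the rule developed in the thesis is an assumption on our part.

```python
import numpy as np

def adaptive_gd(grad, x0, lam0=1e-6, iters=200):
    """Sketch of parameter-free gradient descent: the stepsize adapts to
    the local smoothness estimate ||g_k - g_{k-1}|| / ||x_k - x_{k-1}||."""
    x_prev, g_prev = x0, grad(x0)
    lam_prev, theta = lam0, np.inf
    x = x_prev - lam_prev * g_prev            # first (tiny) step
    for _ in range(iters):
        g = grad(x)
        L_est = np.linalg.norm(g - g_prev) / max(np.linalg.norm(x - x_prev), 1e-16)
        lam = min(np.sqrt(1 + theta) * lam_prev,   # controlled stepsize growth
                  1 / (2 * max(L_est, 1e-16)))     # local smoothness bound
        theta = lam / lam_prev
        x_prev, g_prev, lam_prev = x, g, lam
        x = x - lam * g                       # gradient step, adaptive stepsize
    return x

# Illustrative usage on a simple one-dimensional quadratic.
x_star = adaptive_gd(lambda x: 4 * (x - 3), x0=np.array([0.0]))
```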