On the Improvement of the Barzilai–Borwein Step Size in Variance Reduction Methods


References (showing 10 of 18 papers)
  • Open Access Icon
  • Cite Count Icon 123
  • 10.1109/allerton.2016.7852377
Stochastic Frank-Wolfe methods for nonconvex optimization
  • Sep 1, 2016
  • Sashank J Reddi + 3 more

  • Open Access Icon
  • Cite Count Icon 383
  • 10.1093/imanum/13.3.321
On the Barzilai and Borwein choice of steplength for the gradient method
  • Jan 1, 1993
  • IMA Journal of Numerical Analysis
  • Marcos Raydan

  • Cite Count Icon 10
  • 10.1007/s11590-020-01550-x
A linearly convergent stochastic recursive gradient method for convex optimization
  • Feb 17, 2020
  • Optimization Letters
  • Yan Liu + 2 more

  • Open Access Icon
  • Cite Count Icon 13
  • 10.1609/aaai.v32i1.11599
Stochastic Non-Convex Ordinal Embedding With Stabilized Barzilai-Borwein Step Size
  • Apr 29, 2018
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Ke Ma + 6 more

  • Open Access Icon
  • Cite Count Icon 337
  • 10.1007/s00211-004-0569-y
Projected Barzilai-Borwein methods for large-scale box-constrained quadratic programming
  • Feb 16, 2005
  • Numerische Mathematik
  • Yu-Hong Dai + 1 more

  • Open Access Icon
  • Cite Count Icon 1008
  • 10.1137/s1052623497330963
Nonmonotone Spectral Projected Gradient Methods on Convex Sets
  • Jan 1, 2000
  • SIAM Journal on Optimization
  • Ernesto G Birgin + 2 more

  • Cite Count Icon 726
  • 10.1137/s1052623494266365
The Barzilai and Borwein Gradient Method for the Large Scale Unconstrained Minimization Problem
  • Feb 1, 1997
  • SIAM Journal on Optimization
  • Marcos Raydan

  • Cite Count Icon 37
  • 10.1007/978-3-319-17689-5_3
A Positive Barzilai–Borwein-Like Stepsize and an Extension for Symmetric Linear Systems
  • Jan 1, 2015
  • Yu-Hong Dai + 2 more

  • Cite Count Icon 102
  • 10.1093/imanum/23.3.377
Alternate minimization gradient method
  • Jul 1, 2003
  • IMA Journal of Numerical Analysis
  • Y.-H Dai

  • Open Access Icon
  • Cite Count Icon 22
  • 10.1137/19m1256919
On the Adaptivity of Stochastic Gradient-Based Optimization
  • Jan 1, 2020
  • SIAM Journal on Optimization
  • Lihua Lei + 1 more

Similar Papers
  • Research Article
  • 10.1080/00295639.2024.2302764
Effectiveness of Radiation Transport Variance Reduction Methods for Wide-Area Environmental Contamination Assay Applications
  • Feb 9, 2024
  • Nuclear Science and Engineering
  • E Asano + 1 more

This study compares the accuracy, efficiency, and reliability of variance reduction (VR) methods for Monte Carlo radiation transport simulations involving wide-area ground plane (i.e., “surface”) and buried (i.e., “volumetric”) gamma source emissions from environmental soil. The simulation models are idealized external exposure scenarios intended as a basis for deriving site-specific dose-based or carcinogenic risk–based regulatory limits in the radiological site remediation process. These simulations are computationally resource intensive since particle tracks are transported from an extremely large source region to a relatively small detector region. For each simulation, several VR methods are compared with metrics of accuracy, efficiency, and reliability. The MCNP deterministic transport (DXTRAN) VR method was most effective for problems involving sources emitting low-energy gamma rays, and a coupled multicode method was more effective for problems involving sources emitting higher-energy gamma rays that undergo significant attenuation in the soil.

  • Research Article
  • Cite Count Icon 5
  • 10.1016/0266-352x(87)90046-2
Reliability assessment of test embankments on soft Bangkok clay by variance reduction and nearest-neighbor methods
  • Jan 1, 1987
  • Computers and Geotechnics
  • D.T Bergado + 3 more


  • Research Article
  • Cite Count Icon 5
  • 10.1007/s10994-022-06265-x
SVRG meets AdaGrad: painless variance reduction
  • Nov 10, 2022
  • Machine Learning
  • Benjamin Dubois-Taine + 4 more

Variance reduction (VR) methods for finite-sum minimization typically require the knowledge of problem-dependent constants that are often unknown and difficult to estimate. To address this, we use ideas from adaptive gradient methods to propose AdaSVRG, which is a more robust variant of SVRG, a common VR method. AdaSVRG uses AdaGrad, a common adaptive gradient method, in the inner loop of SVRG, making it robust to the choice of step-size. When minimizing a sum of n smooth convex functions, we prove that a variant of AdaSVRG requires \(\tilde{O}(n + 1/\epsilon )\) gradient evaluations to achieve an \(O(\epsilon )\)-suboptimality, matching the typical rate, but without needing to know problem-dependent constants. Next, we show that the dynamics of AdaGrad exhibit a two-phase behavior: the step-size remains approximately constant in the first phase, and then decreases at a \(O\left( {1}/{\sqrt{t}}\right)\) rate. This result may be of independent interest, and allows us to propose a heuristic that adaptively determines the length of each inner loop in AdaSVRG. Via experiments on synthetic and real-world datasets, we validate the robustness and effectiveness of AdaSVRG, demonstrating its superior performance over standard and other “tune-free” VR methods.
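
To make the idea of "AdaGrad in the inner loop of SVRG" concrete, here is a minimal sketch on a toy least-squares problem. It is not the AdaSVRG algorithm as specified in the paper: the scalar (norm-based) AdaGrad accumulator, resetting it at every outer loop, the inner-loop length, taking the next snapshot as the average of the inner iterates, the value of `eta`, and the data in `DATA` are all assumptions made for illustration.

```python
import math
import random


def grad_i(w, x, y):
    """Gradient of the single-sample loss 0.5 * (w.x - y)^2 with respect to w."""
    r = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [r * xj for xj in x]


def full_grad(w, data):
    """Average of the per-sample gradients over the whole dataset."""
    g = [0.0] * len(w)
    for x, y in data:
        gi = grad_i(w, x, y)
        g = [a + b / len(data) for a, b in zip(g, gi)]
    return g


def vr_grad(w, snapshot, mu, x, y):
    """SVRG estimator: grad_i(w) - grad_i(snapshot) + full gradient at snapshot."""
    gw = grad_i(w, x, y)
    gs = grad_i(snapshot, x, y)
    return [a - b + c for a, b, c in zip(gw, gs, mu)]


def svrg_adagrad(data, w0, outer=20, eta=0.5, seed=0):
    """SVRG outer loop with a scalar AdaGrad step size in the inner loop."""
    rng = random.Random(seed)
    snapshot = list(w0)
    for _ in range(outer):
        mu = full_grad(snapshot, data)       # full gradient at the snapshot
        w = list(snapshot)
        acc = 0.0                            # AdaGrad accumulator, reset per outer loop (assumption)
        total = [0.0] * len(w0)
        inner = len(data)                    # inner-loop length (assumption)
        for _ in range(inner):
            x, y = rng.choice(data)
            g = vr_grad(w, snapshot, mu, x, y)
            acc += sum(gj * gj for gj in g)
            if acc > 0.0:
                step = eta / math.sqrt(acc)  # step size shrinks as gradients accumulate
                w = [wj - step * gj for wj, gj in zip(w, g)]
            total = [t + wj for t, wj in zip(total, w)]
        snapshot = [t / inner for t in total]  # averaged iterate becomes the next snapshot
    return snapshot


# Hypothetical noise-free data generated from w_true = [1, -2].
DATA = [([1.0, 0.0], 1.0), ([0.0, 1.0], -2.0), ([1.0, 1.0], -1.0), ([1.0, -1.0], 3.0)]
w_hat = svrg_adagrad(DATA, [0.0, 0.0])
```

The only quantity the user supplies is `eta`; the per-step size is derived from the accumulated squared gradient norms, which is the sense in which such a scheme is less sensitive to step-size tuning.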

  • Research Article
  • Cite Count Icon 31
  • 10.1016/j.ejor.2020.08.058
A faster path-based algorithm with Barzilai-Borwein step size for solving stochastic traffic equilibrium models
  • Sep 6, 2020
  • European Journal of Operational Research
  • Muqing Du + 2 more


  • Research Article
  • Cite Count Icon 3
  • 10.1093/biomet/61.1.143
Variance reduction and nonnormality
  • Jan 1, 1974
  • Biometrika
  • W Keith Hastings

There are many available variance reduction methods and these are described in the sampling theory literature in such works as Kish (1965) and Raj (1968), and in the literature of Monte Carlo methods (Hammersley & Handscomb, 1964) and in the survey paper by Halton (1970). In sampling problems where a mean is to be estimated, these variance reduction methods may induce extreme nonnormality in the distribution of the resulting estimate which in turn may cause difficulties with assessment of the error of the estimate and with other inferential procedures. We have emphasized the effects on \(\beta_1 = \mu_3^2/\sigma^6\) of four major techniques of variance reduction, namely, importance or probability proportional to estimated size sampling, regression, the use of conditional expectation, and stratification. The results are readily extended to include the effects on \(\beta_2 = \mu_4/\sigma^4\). To facilitate comparisons with population values, \(\sigma^2\) and \(\beta_1\) are expressed on a unit observation basis throughout.

  • Research Article
  • Cite Count Icon 10
  • 10.1016/j.pnucene.2016.03.023
Development of new variance reduction methods based on weight window technique in RMC code
  • Apr 6, 2016
  • Progress in Nuclear Energy
  • Xiao Fan + 2 more


  • Research Article
  • Cite Count Icon 8
  • 10.1016/j.enggeo.2022.106804
Evaluation of the scale of fluctuation based on variance reduction method
  • Aug 4, 2022
  • Engineering Geology
  • Suozhu Fei + 5 more


  • Research Article
  • Cite Count Icon 1
  • 10.1080/00207727508941861
Dynamic models for variance reduction in particle transport simulation
  • Aug 1, 1975
  • International Journal of Systems Science
  • Moshe Goldstein + 1 more

A new formulation of the Monte-Carlo method for dynamic models of three-dimensional particle transport within a homogeneous medium is presented. A basic Monte-Carlo procedure is defined as the so-called ‘basic model’. Then each of two variance reduction methods (statistical weighting and exponential transform) is formulated by alterations in the basic model, so as to produce so-called ‘biased models’. Each biased model is formally shown to produce the same estimate of the parameter of interest (detector response) as the basic model. This is done using methods of comparison of dynamic models derived from system theory. It is also shown that the two variance reduction methods are significant in reducing the running time of the problem.

  • Conference Article
  • Cite Count Icon 4
  • 10.1109/cso.2011.123
Efficient Simulations for Exotic Options under NIG Model
  • Apr 1, 2011
  • Yongjia Xu + 2 more

This paper discusses the Monte Carlo and quasi-Monte Carlo methods combined with some variance reduction techniques for exotic option pricing where the log returns of the underlying asset prices follow both the NIG and the normal distributions. An arithmetic Asian option and an Up-and-Out Asian option are considered in this paper. Our test results show that variance reduction methods can usually reduce variances significantly if they are chosen carefully. The results also show that the (randomized) quasi-Monte Carlo method is more efficient than the Monte Carlo method if both are combined with the same variance reduction method.

  • Research Article
  • Cite Count Icon 2
  • 10.1063/1.5081446
Application of the interacting particle system method to piecewise deterministic Markov processes used in reliability.
  • Jun 1, 2019
  • Chaos (Woodbury, N.Y.)
  • Hassane Chraibi + 3 more

Variance reduction methods are often needed for the reliability assessment of complex industrial systems. We focus on one variance reduction method in a given context, namely the interacting particle system (IPS) method applied to piecewise deterministic Markov processes (PDMPs) for reliability assessment. PDMPs are a very large class of processes with high modeling capacity: they can model almost any Markovian phenomenon that does not include diffusion. In reliability assessment, the PDMPs modeling industrial systems generally involve low jump rates and jump kernels favoring one safe arrival; we call such a model a "concentrated PDMP." Used on such concentrated PDMPs, the IPS is inefficient and does not always provide a variance reduction. Indeed, the efficiency of the IPS method relies on simulating many different trajectories during its propagation steps, but unfortunately, concentrated PDMPs are likely to generate the same deterministic trajectories over and over. We propose an adaptation of the IPS method called IPS+M that reduces this phenomenon. The IPS+M consists in modifying the propagation steps of the IPS, by conditioning the propagation to avoid generating the same trajectories multiple times. We prove that, compared to the IPS, the IPS+M method always provides an estimator with a lower variance. We also carry out simulations on two-component systems that confirm these results.

  • Research Article
  • Cite Count Icon 20
  • 10.1016/j.camwa.2023.04.024
Density-extrapolation Global Variance Reduction (DeGVR) method for large-scale radiation field calculation
  • Aug 1, 2023
  • Computers & Mathematics with Applications
  • Qingquan Pan + 4 more


  • Research Article
  • Cite Count Icon 5
  • 10.1016/j.ejor.2023.09.018
Improving uplift model evaluation on randomized controlled trial data
  • Sep 17, 2023
  • European Journal of Operational Research
  • Björn Bokelmann + 1 more


  • Research Article
  • 10.1088/1757-899x/603/3/032094
Assessing the Efficiency of Variance Reduction Methods in the Construction Project Network Simulation
  • Sep 1, 2019
  • IOP Conference Series: Materials Science and Engineering
  • Slawomir Biruk + 2 more

The Monte Carlo simulation has become a standard tool in the practice of planning risk-affected projects. In particular, it is frequently applied to testing the impact of risk on schedule networks with deterministic structures and random activity durations defined by distribution functions of any type. The accuracy of simulation-based estimates can be improved by increasing the number of replications or by applying variance reduction methods. This paper focuses on the latter and analyzes the impact of the variance reduction method on the scale of the standard error of the estimated mean value of project duration. Three methods of variance reduction were examined: the Quasi-Monte Carlo with Weyl sequence sampling, the antithetic variates, and the Latin Hypercube Sampling. The object of the simulation experiment was a sample network model with the activity durations of triangular distributions. This type of distribution was selected as it is often applied in the practice of construction scheduling to capture the variability of operating conditions in the absence of grounds for assuming other types of distribution. The results of the sample simulation provided an indirect proof that applying variance reduction measures may reduce the time of the simulation experiment (reduced number of replications) as well as improve the confidence in the estimates of the model’s characteristics.
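
As an illustration of one of the three techniques named in this abstract, the sketch below applies antithetic variates to the mean duration of a small schedule network with triangular activity durations. The network layout (two parallel paths of two activities each), the parameters in `ACTIVITIES`, the sample sizes, and the seed are hypothetical and are not taken from the paper; Quasi-Monte Carlo and Latin Hypercube Sampling are not shown.

```python
import math
import random
import statistics


def triangular_inv_cdf(u, low, mode, high):
    """Inverse CDF of a triangular distribution on [low, high] with the given mode."""
    cut = (mode - low) / (high - low)
    if u < cut:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))


# Hypothetical activities as (low, mode, high): path A uses the first two, path B the last two.
ACTIVITIES = [(2.0, 4.0, 9.0), (3.0, 5.0, 8.0), (1.0, 6.0, 10.0), (2.0, 3.0, 7.0)]


def project_duration(us):
    """Project duration is the longer of the two parallel paths."""
    d = [triangular_inv_cdf(u, *params) for u, params in zip(us, ACTIVITIES)]
    return max(d[0] + d[1], d[2] + d[3])


def pair_means(n_pairs, antithetic, seed=0):
    """Average the duration over pairs of replications.

    With antithetic=True the second replication reuses the uniforms as 1 - u,
    otherwise it draws fresh independent uniforms.
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n_pairs):
        u1 = [rng.random() for _ in ACTIVITIES]
        if antithetic:
            u2 = [1.0 - u for u in u1]
        else:
            u2 = [rng.random() for _ in ACTIVITIES]
        out.append(0.5 * (project_duration(u1) + project_duration(u2)))
    return out


plain = pair_means(5000, antithetic=False)
anti = pair_means(5000, antithetic=True)
var_plain = statistics.variance(plain)  # variance of independent pair averages
var_anti = statistics.variance(anti)    # variance of antithetic pair averages
```

Because the duration is monotone in every uniform input, the two replications of an antithetic pair are negatively correlated, so both estimators target the same mean while the antithetic one needs fewer replications for a given standard error.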

  • Conference Article
  • Cite Count Icon 2
  • 10.1117/12.2213470
Methods for variance reduction in Monte Carlo simulations
  • Mar 7, 2016
  • Gabriel Elpers + 5 more

Monte Carlo simulations are widely considered to be the gold standard for studying the propagation of light in turbid media. However, due to the probabilistic nature of these simulations, large numbers of photons are often required in order to generate relevant results. Here, we present methods for reduction in the variance of dose distribution in a computational volume. Dose distribution is computed via tracing of a large number of rays, and tracking the absorption and scattering of the rays within discrete voxels that comprise the volume. Variance reduction is shown here using quasi-random sampling, interaction forcing for weakly scattering media, and dose smoothing via bi-lateral filtering. These methods, along with the corresponding performance enhancements are detailed here.

  • Conference Article
  • 10.1109/icvr55215.2022.9847937
The Improved Adaptive Algorithm of Deep Learning with Barzilai-Borwein Step Size
  • May 26, 2022
  • Zhi-Jun Wang + 5 more

To solve the problem that it is difficult to determine the learning rate when training a neural network model, this paper proposes an improved adaptive algorithm based on the Barzilai-Borwein (BB) step size. In this paper, the new algorithm accelerates the model's training through the second-order momentum and adapts the learning rate according to the BB step size. We also set an adequate range for the learning rate to ensure the stability of adaptive adjustment and reduce the error of step size. Compared with different algorithms in a series of popular models, the new algorithm significantly avoids the tediousness of manually adjusting the learning rate and helps to improve the convergence speed. The results show that the new algorithm is feasible and effective.
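
The two ingredients this abstract describes, a learning rate taken from the Barzilai-Borwein (BB) formula and a bounded range that keeps it stable, can be sketched on plain gradient descent. This is only an illustration under assumptions: it uses the standard BB1 formula \(s^\top s / s^\top y\), omits the second-order momentum of the paper's algorithm, and the bounds `lr_min`, `lr_max`, the `fallback` value, and the toy quadratic are hypothetical choices.

```python
def bb_step(s, y, lr_min=1e-4, lr_max=1.0, fallback=1e-2):
    """BB1 step size <s, s> / <s, y>, clipped to [lr_min, lr_max].

    s is the change in the iterate and y the change in the gradient.
    If the curvature estimate <s, y> is not positive, return the fallback.
    """
    sy = sum(si * yi for si, yi in zip(s, y))
    if sy <= 0.0:
        return fallback
    ss = sum(si * si for si in s)
    return min(max(ss / sy, lr_min), lr_max)


def gradient_descent_bb(grad, x0, iters=100, lr0=1e-2):
    """Gradient descent whose learning rate is re-estimated by bb_step each iteration."""
    x = list(x0)
    g = grad(x)
    lr = lr0
    for _ in range(iters):
        x_new = [xi - lr * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        lr = bb_step(s, y)
        x, g = x_new, g_new
    return x


def quad_grad(x):
    """Gradient of the toy objective f(x) = 0.5 * (x0^2 + 10 * x1^2)."""
    return [x[0], 10.0 * x[1]]


x_final = gradient_descent_bb(quad_grad, [5.0, 5.0])
```

No learning rate is tuned by hand beyond the initial `lr0`; after the first iteration the step is driven entirely by the observed changes in the iterate and the gradient.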

More from: Applied Mathematics & Optimization
  • Research Article
  • 10.1007/s00245-025-10297-9
Robust pointwise second-order necessary conditions for singular stochastic optimal control with model uncertainty
  • Nov 6, 2025
  • Applied Mathematics & Optimization
  • Guangdong Jing

  • Research Article
  • 10.1007/s00245-025-10322-x
Singleton Sets Random Attractors for Lattice Dynamical Systems Driven by a Fractional Brownian Motion Revisited
  • Nov 6, 2025
  • Applied Mathematics & Optimization
  • Anhui Gu

  • Research Article
  • 10.1007/s00245-025-10325-8
Mean-Field Games of Optimal Stopping: Master Equation and Weak Equilibria
  • Nov 4, 2025
  • Applied Mathematics & Optimization
  • Dylan Possamaï + 1 more

  • Research Article
  • 10.1007/s00245-025-10333-8
Multi-bump Type Nodal Solutions for a Fractional p-Laplacian Logarithmic Schrödinger Equation with Deepening Potential Well
  • Oct 28, 2025
  • Applied Mathematics & Optimization
  • Lin Li + 2 more

  • Research Article
  • 10.1007/s00245-025-10326-7
Optimal Control for a Quasistatic Viscoelastic Contact Problem
  • Oct 28, 2025
  • Applied Mathematics & Optimization
  • Dong-Ling Cai + 2 more

  • Research Article
  • 10.1007/s00245-025-10336-5
Global Existence and Asymptotic Behavior for a Two-Species Chemotaxis-Competition System with Loop and Singular Sensitivity
  • Oct 28, 2025
  • Applied Mathematics & Optimization
  • Min Jiang + 1 more

  • Research Article
  • 10.1007/s00245-025-10335-6
On a Chemotaxis-Generalized Navier–Stokes System with Rotational Flux: Global Classical Solutions and Stabilization
  • Oct 25, 2025
  • Applied Mathematics & Optimization
  • Chao Jiang + 2 more

  • Research Article
  • 10.1007/s00245-025-10317-8
Equilibrium Configurations of a 3D Fluid-Beam Interaction Problem
  • Oct 13, 2025
  • Applied Mathematics & Optimization
  • Vincenzo Bianca + 2 more

  • Research Article
  • 10.1007/s00245-025-10320-z
Time-Inconsistent Linear-Quadratic Social Optima for Large Population System
  • Oct 13, 2025
  • Applied Mathematics & Optimization
  • Haiyang Wang + 1 more

  • Research Article
  • 10.1007/s00245-025-10338-3
Modelling of Poro-Acoustic Media
  • Oct 1, 2025
  • Applied Mathematics & Optimization
  • Isabelle Gruais
