Enhancing finite-difference-based derivative-free optimization with machine learning

Similar Papers
  • Conference Article
  • Citations: 17
  • 10.24963/ijcai.2018/315
Experienced Optimization with Reusable Directional Model for Hyper-Parameter Search
  • Jul 1, 2018
  • Yi-Qi Hu + 2 more

Hyper-parameter selection is a crucial yet difficult issue in machine learning. For this problem, derivative-free optimization has been playing an irreplaceable role. However, derivative-free optimization commonly requires a lot of hyper-parameter samples, while each sample could have a high cost for hyper-parameter selection due to the costly evaluation of a learning model. To tackle this issue, in this paper, we propose an experienced optimization approach, i.e., learning how to optimize better from a set of historical optimization processes. From the historical optimization processes on previous datasets, a directional model is trained to predict the direction of the next good hyper-parameter. The directional model is then reused to guide the optimization in learning new datasets. We implement this mechanism within a state-of-the-art derivative-free optimization method, SRacos, and conduct experiments on learning the hyper-parameters of heterogeneous ensembles and neural network architectures. Experimental results verify that the proposed approach can significantly improve the learning accuracy within a limited hyper-parameter sample budget.

  • Research Article
  • 10.17721/2706-9699.2025.1.07
Derivative-free optimization for custom loss functions
  • Jan 1, 2025
  • Journal of Numerical and Applied Mathematics
  • O S Maistrenko

Derivative-free optimization (DFO) has emerged as a powerful technique for solving optimization problems where the gradient of the objective function is either unavailable, expensive to compute, or non-smooth. This article explores the application of DFO methods to optimize custom loss functions in machine learning and other fields. The paper also highlights the challenges and potential improvements in the current DFO approaches, offering insights for further research and practical applications.

  • Conference Article
  • Citations: 4
  • 10.1115/detc2021-70505
A New Triangle: Fractional Calculus, Renormalization Group, and Machine Learning
  • Aug 17, 2021
  • Haoyu Niu + 3 more

The emergence of the systematic study of complexity as a science has resulted from the growing recognition that the fundamental assumptions upon which Newtonian physics is based are not satisfied throughout most of science, e.g., time is not necessarily uniformly flowing in one direction, nor is space homogeneous. Herein we discuss how the fractional calculus (FC), renormalization group (RG) theory and machine learning (ML) have each developed independently in the study of distinct phenomena in which one or more of the underlying assumptions of Newtonian formalism is violated. FC has been shown to help us better understand complex systems, improve the processing of complex signals, enhance the control of complex networks, increase optimization performance, and even extend the enabling of the potential for creativity. RG allows one to investigate the changes of a dynamical system at different scales. For example, in quantum field theory, divergent parts of a calculation can lead to nonsensical infinite results. However, by applying RG, the divergent parts can be absorbed into fewer measurable quantities, yielding finite results. To date, ML is a fashionable research topic and will probably remain so into the foreseeable future. How a model can learn efficiently (optimally) is always essential. The key to learnability is designing efficient optimization methods. Although extensive research has been carried out on the three topics separately, few studies have investigated the association triangle between the FC, RG, and ML. To initiate the study of their interdependence, herein the authors discuss the critical connections between them (Fig. 1). In the FC and RG, scaling laws reveal the complexity of the phenomena discussed. The authors emphasize that the FC’s and RG’s critical connection is the form of inverse power laws (IPL), and the IPL index provides a measure of the level of complexity.
For FC and ML, the critical connections lie in big data, where variability, optimization, and non-local models are described. The authors introduce the derivative-free and gradient-based optimization methods and explain how the FC could contribute to these study areas. In the end, the association between the RG and ML is also explained. The mutual information, feature extraction, and locality are also discussed. Many of the cross-sectional studies suggest a connection between the RG and ML. The RG has a superficial similarity to the structure of deep neural networks (DNNs), in which one marginalizes over hidden degrees of freedom. The authors remark in the conclusions that the association triangle between FC, RG, and ML forms a stool on which the foundation of complexity science might comfortably sit for a wide range of future research topics.

  • Research Article
  • Citations: 20
  • 10.1007/s11432-021-3416-y
ZOOpt: a toolbox for derivative-free optimization
  • Sep 22, 2022
  • Science China Information Sciences
  • Yu-Ren Liu + 4 more

Recent advances in derivative-free optimization allow efficient approximation of the global-optimal solutions of sophisticated functions, such as functions with many local optima, non-differentiable and non-continuous functions. This article describes the ZOOpt (Zeroth Order Optimization) toolbox, which provides efficient derivative-free solvers and is designed to be easy to use. ZOOpt provides single-machine parallel optimization on the basis of a Python core, and multi-machine distributed optimization for time-consuming tasks by integrating with the Ray framework, a well-known platform for building distributed applications. ZOOpt particularly focuses on optimization problems in machine learning, addressing high-dimensional and noisy problems such as hyper-parameter tuning and direct policy search. The toolbox is maintained as a ready-to-use tool for real-world machine learning tasks.

  • Research Article
  • Citations: 20
  • 10.46632/daai/2/2/7
Nelder–Mead Simplex Search Method - A Study
  • Aug 1, 2022
  • Data Analytics and Artificial Intelligence
  • Manjula Selvam + 2 more

In n dimensions, the Nelder-Mead method maintains a set of n + 1 test points. The behavior of the objective function is measured at each test point; the method then finds a new test point, replaces one of the old test points with it, and so the technique progresses. The Nelder-Mead simplex method uses a simplex to search for a minimum. The algorithm operates on a structure of n + 1 points (called a simplex), where n is the number of input dimensions. The Nelder-Mead method is one of the most popular derivative-free methods, using only the values of f to search. The simplex of n + 1 points is moved, expanded, or contracted in a favorable direction. Strictly speaking, Nelder-Mead is not a truly global optimization algorithm; however, in practice it works reasonably well for many non-local problems. Direct search is a method for solving optimization problems that requires no information about the gradient of the objective function. Pattern search algorithms compute a sequence of points that approach an optimal point. The existence of local optima is a key factor in defining the difficulty of a global optimization problem, because it is relatively easy to improve locally and relatively difficult to improve globally. Gradient descent is an optimization method commonly used to train machine learning models and neural networks. Training data allow these models to learn over time, and the cost function in gradient descent acts as a barometer, measuring accuracy at each parameter update. Optimization can be repeated, comparing different solutions until an optimal or satisfactory solution is found. With the advent of computers, optimization has become a part of computer-aided design activities. Gradient descent (GD) is a first-order iterative optimization algorithm used to find a local minimum or maximum of a given function.
This method is commonly used to minimize cost/loss functions in machine learning (ML) and deep learning (DL). The problem of finding optimal points when derivatives are unavailable is referred to as derivative-free optimization, and algorithms that do not use derivatives or finite differences are called derivative-free algorithms.
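
The simplex update described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the implementation studied in the paper: the function name `nelder_mead`, the coefficients (reflection 1, expansion 2, contraction 0.5, shrink 0.5), the axis-aligned initial simplex, and the size-based stopping rule are all assumptions chosen for brevity, and only an inside contraction is used.

```python
def nelder_mead(f, x0, step=0.5, max_iter=500, tol=1e-8):
    """Minimize f from x0 with a basic Nelder-Mead simplex search (sketch)."""
    n = len(x0)
    # Initial simplex: x0 plus one point offset along each coordinate axis.
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    values = [f(p) for p in simplex]

    for _ in range(max_iter):
        # Order the n + 1 points from best (lowest f) to worst.
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]

        # Stop once every point is within tol of the best point.
        size = max(abs(a - b) for p in simplex[1:] for a, b in zip(p, simplex[0]))
        if size < tol:
            break

        # Centroid of every point except the worst one.
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point on the line through the centroid, away from the worst point.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_r = f(reflected)
        if f_r < values[0]:
            # Reflection beat the best point: try going further (expansion).
            expanded = along(2.0)
            f_e = f(expanded)
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            # Reflection beat the second-worst point: accept it.
            simplex[-1], values[-1] = reflected, f_r
        else:
            # Contract toward the centroid; if that fails, shrink the simplex.
            contracted = along(-0.5)
            f_c = f(contracted)
            if f_c < values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                best = simplex[0]
                for i in range(1, n + 1):
                    simplex[i] = [b + 0.5 * (x - b) for b, x in zip(best, simplex[i])]
                    values[i] = f(simplex[i])

    best_index = min(range(n + 1), key=lambda i: values[i])
    return simplex[best_index], values[best_index]
```

For example, `nelder_mead(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2, [0.0, 0.0])` searches a two-dimensional quadratic using only function values, with no gradient information.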

  • Research Article
  • Citations: 30
  • 10.1609/aaai.v33i01.33013846
Multi-Fidelity Automatic Hyper-Parameter Tuning via Transfer Series Expansion
  • Jul 17, 2019
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Yi-Qi Hu + 5 more

Automatic machine learning (AutoML) aims at automatically choosing the best configuration for machine learning tasks. However, a configuration evaluation can be very time consuming, particularly on learning tasks with large datasets. This limitation usually restrains derivative-free optimization from releasing its full power for a fine configuration search using many evaluations. To alleviate this limitation, in this paper, we propose a derivative-free optimization framework for AutoML using multi-fidelity evaluations. It uses many low-fidelity evaluations on small data subsets and very few high-fidelity evaluations on the full dataset. However, the low-fidelity evaluations can be badly biased, and need to be corrected at only a very low cost. We thus propose the Transfer Series Expansion (TSE) that learns the low-fidelity correction predictor efficiently by linearly combining a set of base predictors. The base predictors can be obtained cheaply from down-scaled and experienced tasks. Experimental results on real-world AutoML problems verify that the proposed framework can accelerate derivative-free configuration search significantly by making use of the multi-fidelity evaluations.

  • Book Chapter
  • Citations: 1
  • 10.1016/b978-0-323-85159-6.50260-8
Generation and Benefit of Surrogate Models for Blackbox Chemical Flowsheet Optimization
  • Jan 1, 2022
  • Computer Aided Chemical Engineering
  • Tim Janus + 2 more

  • Conference Article
  • Citations: 8
  • 10.1109/mwscas.2019.8884831
Machine Learning Based Image Calibration for a Twofold Time-Interleaved High Speed DAC
  • Aug 1, 2019
  • Daniel Beauchamp + 1 more

In this paper, we propose a novel image calibration algorithm for a twofold time-interleaved DAC (TIDAC). The algorithm is based on simulated annealing, which is often used in the field of machine learning to solve derivative-free optimization (DFO) problems. The digital-to-analog converter (DAC) under consideration is part of a digital transceiver core that contains a high speed analog-to-digital converter (ADC), microcontroller, and digital control via a Serial Peripheral Interface (SPI). These are used as tools for designing an algorithm which suppresses the interleave image to the noise floor. The algorithm is supported with experimental results in silicon on a 10-bit twofold TIDAC operating at a sample rate of 50 GS/s in 14nm CMOS technology.
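
Simulated annealing, the derivative-free method named in this abstract, can be sketched generically as follows. This is not the authors' calibration algorithm: the function `simulated_annealing`, the uniform proposal, the geometric cooling schedule, and all default parameters are assumptions, and `cost` is a hypothetical stand-in for whatever quantity is being minimized (in the paper's setting, a measured interleave-image level).

```python
import math
import random


def simulated_annealing(cost, x0, step=0.5, t0=1.0, cooling=0.995,
                        iters=2000, seed=0):
    """Minimize cost(x) by simulated annealing (generic sketch)."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    x, fx = list(x0), cost(x0)
    best_x, best_f = list(x), fx
    t = t0
    for _ in range(iters):
        # Propose a random perturbation of the current point.
        cand = [xi + rng.uniform(-step, step) for xi in x]
        fc = cost(cand)
        # Always accept improvements; accept a worse candidate with
        # probability exp(-(fc - fx) / t), which shrinks as t cools.
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = list(x), fx
        t *= cooling  # geometric cooling schedule (an assumption)
    return best_x, best_f
```

The method needs only cost evaluations, which is why it suits settings where the cost comes from a hardware measurement rather than a differentiable model.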

  • Research Article
  • Citations: 9
  • 10.1016/j.compchemeng.2020.106763
Industrial text analytics for reliability with derivative-free optimization
  • Feb 3, 2020
  • Computers & Chemical Engineering
  • Tong Zhang + 5 more

Maintenance work order records provide valuable insights into chemical plants and production efficiency. These records are manually created in computerized management systems for routine and emergency maintenance. However, since the records are manually created, recording errors are not uncommon. The resulting datasets are additionally imbalanced, i.e., they have significantly more instances of certain classes than other minority classes. It is very challenging to use such datasets for classification and prediction of future events. In this paper, we propose a modeling framework that uses derivative-free optimization (DFO) to optimize the performance of classification models based on datasets that may be imbalanced. We apply our modeling framework to 15 real-world work order datasets. We also evaluate ten mixed-integer box-bounded DFO solvers for their ability to optimize machine learning models from industrial datasets. Compared to standard solutions, our results show dramatic improvements in the prediction accuracies of the models.

  • Research Article
  • Citations: 38
  • 10.1016/j.energy.2020.117136
Feature engineering and forecasting via derivative-free optimization and ensemble of sequence-to-sequence networks with applications in renewable energy
  • Feb 17, 2020
  • Energy
  • Mohammad Pirhooshyaran + 2 more

  • Research Article
  • Citations: 2
  • 10.1109/xloop54565.2021.00009
High-Performance Hybrid-Global-Deflated-Local Optimization with Applications to Active Learning.
  • Nov 1, 2021
  • Annual Workshop on Extreme-scale Experiment-in-the-Loop Computing (XLOOP)
  • Marcus Michael Noack + 3 more

Mathematical optimization lies at the core of many science and industry applications. One important issue with many current optimization strategies is a well-known trade-off between the number of function evaluations and the probability of finding the global optimum, or at least sufficiently high-quality local optima. In machine learning (ML), and by extension in active learning (for instance, for autonomous experimentation), mathematical optimization is often used to find the underlying uncertain surrogate model from which subsequent decisions are made, and therefore ML relies on high-quality optima to obtain the most accurate models. Active learning often has the added complexity of missing offline training data; therefore, the training has to be conducted during the data collection, which can stall the acquisition if standard methods are used. In this work, we highlight recent efforts to create a high-performance hybrid optimization algorithm (HGDL), combining derivative-free global optimization strategies with local, derivative-based optimization, ultimately yielding an ordered list of unique local optima. Redundancies are avoided by deflating the objective function around earlier encountered optima. HGDL is designed to take full advantage of parallelism by having the most computationally expensive process, the local first- and second-order derivative-based optimizations, run in parallel on separate compute nodes in separate processes. In addition, the algorithm runs asynchronously; as soon as the first solution is found, it can be used while the algorithm continues to find more solutions. We apply the proposed optimization and training strategy to Gaussian-Process-driven stochastic function approximation and active learning.
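
The idea of deflation mentioned in this abstract can be illustrated in one dimension. The sketch below is a generic deflation scheme and not necessarily the operator used by HGDL: the gradient `g` is divided by the distance to each stationary point already found, so that Newton's method started from the same point is pushed toward a different solution. The helper names, the finite-difference slope, and the tolerances are assumptions. Note that this finds stationary points of either kind, not only minima.

```python
def newton_root(h, x, eps=1e-6, tol=1e-10, max_iter=100):
    """Find a root of h near x using Newton's method with a numerical slope."""
    for _ in range(max_iter):
        hx = h(x)
        if abs(hx) < tol:
            return x
        # Central finite difference as a stand-in for the derivative of h.
        slope = (h(x + eps) - h(x - eps)) / (2.0 * eps)
        if slope == 0.0:
            raise ZeroDivisionError("flat slope; Newton step undefined")
        x -= hx / slope
    return x


def deflated(g, roots):
    """Return g divided by (x - r) for every known root r."""
    known = tuple(roots)

    def h(x):
        value = g(x)
        for r in known:
            value /= (x - r)  # removes the known root from the landscape
        return value

    return h


def find_stationary_points(g, x0, count):
    """Locate up to `count` distinct roots of the gradient g from one start."""
    roots = []
    for _ in range(count):
        roots.append(newton_root(deflated(g, roots), x0))
    return roots
```

With `g = lambda x: x**3 - x`, the gradient of the hypothetical objective x^4/4 - x^2/2, calling `find_stationary_points(g, 2.0, 3)` starts from the same point three times yet returns three different stationary points, because each found point is deflated away before the next search.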

  • Single Report
  • 10.21236/ada622645
Derivative Free Optimization of Complex Systems with the Use of Statistical Machine Learning Models
  • Sep 12, 2015
  • Katya Scheinberg

This project focused on the development of novel derivative-free optimization methods that rely on recent techniques and models from statistical learning. The main idea of these methods is to build local models of the objective function from randomly sampled data points. This approach has many benefits, in that it allows us to construct fairly accurate models with a relatively small number of samples. The key difference from the deterministic sampling approaches is that these accurate models are constructed with some high probability, but not always. Moreover, it is not known when these models are accurate. Only the probability of an accurate model occurring is known. Under these conditions, novel convergence theory needed to be developed, which has been the focus of our research.
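
Building a local model from randomly sampled points can be sketched as a least-squares fit. The example below is an illustration only, restricted to two dimensions so that the normal equations can be solved in closed form; the function name `random_linear_model`, the uniform sampling box, and the sample count are assumptions, not details from the report.

```python
import random


def random_linear_model(f, x, radius=0.1, samples=8, rng=None):
    """Fit m(x + d) = f(x) + g . d to random samples around a 2-D point x."""
    rng = rng or random.Random(0)
    fx = f(x)
    # Accumulate the 2x2 normal equations (sum d d^T) g = sum d * r,
    # where r = f(x + d) - f(x) is the observed change in f.
    a11 = a12 = a22 = b1 = b2 = 0.0
    for _ in range(samples):
        d1 = rng.uniform(-radius, radius)
        d2 = rng.uniform(-radius, radius)
        r = f((x[0] + d1, x[1] + d2)) - fx
        a11 += d1 * d1
        a12 += d1 * d2
        a22 += d2 * d2
        b1 += d1 * r
        b2 += d2 * r
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-14:
        # Degenerate sample set: the model cannot be determined.
        raise ValueError("random samples do not span the plane")
    # Cramer's rule for the 2x2 system.
    g1 = (b1 * a22 - b2 * a12) / det
    g2 = (a11 * b2 - a12 * b1) / det
    return fx, (g1, g2)
```

For a linear objective the fitted gradient is exact up to rounding. For a general objective, its accuracy depends on which points happen to be drawn, which is the probabilistic behavior the abstract describes.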

  • Book Chapter
  • 10.1016/b978-0-323-85159-6.50211-6
Surrogate Modeling for Superstructure Optimization with Generalized Disjunctive Programming
  • Jan 1, 2022
  • Computer Aided Chemical Engineering
  • H.A Pedrozo + 4 more

  • Research Article
  • Citations: 55
  • 10.2139/ssrn.3344332
Financial Applications of Gaussian Processes and Bayesian Optimization
  • Mar 12, 2019
  • SSRN Electronic Journal
  • Joan Gonzalvez + 3 more

  • Research Article
  • 10.70470/khwarizmia/2026/001
Solving the Heat Conduction Equation Using Butterfly Algorithm Guided by Machine Learning
  • Jan 10, 2026
  • KHWARIZMIA
  • Habeeb Al-Thabhawee + 2 more

In this paper, we propose a hybrid computational framework based on a machine-learning model for the one-dimensional transient heat conduction equation, where the hybrid model is optimized using the Butterfly Optimization Algorithm (BOA). Rather than using classical numerical discretization techniques, the temperature distribution is modeled by a feed-forward neural network capable of learning the spatio-temporal dependence on the space and time coordinates. Chosen for its potential as a derivative-free metaheuristic optimizer, BOA guides the training process to approximate the best parameters of the network within physical constraints. We use a hybrid loss formulation that combines boundary and initial conditions, interior supervised temperature samples, and the governing heat-equation residual to maintain numerical accuracy and consistency with physical principles. This BOA–ML solver is then assessed by representative simulations of fundamental heat conduction problems, with results provided in the form of convergence curves, temperature-field heatmaps, and absolute-error distributions. The results across the simulation cases show that the proposed combination of data-driven and physics-based learning gives stable temperature predictions, and that the hybrid BOA-guided approach consistently yields smaller approximation errors than purely data-driven and purely physics-based training variants. The proposed method may serve as a flexible alternative for heat-transfer modeling, and extensions to higher-dimensional conduction problems and non-simplified thermal boundary conditions are possible directions for future work.
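
One plausible way to write the hybrid loss described in this abstract is shown below. The abstract does not give the formula, so everything here is an assumption: the weights \lambda, the mean-squared form of each term, the symbol \alpha for thermal diffusivity, the absence of a source term, and the notation u_\theta for the network output with parameters \theta.

```latex
% Weighted sum of boundary, initial, data, and PDE-residual terms (assumed form)
\mathcal{L}(\theta)
  = \lambda_{\mathrm{bc}}\,\mathcal{L}_{\mathrm{bc}}(\theta)
  + \lambda_{\mathrm{ic}}\,\mathcal{L}_{\mathrm{ic}}(\theta)
  + \lambda_{\mathrm{data}}\,\mathcal{L}_{\mathrm{data}}(\theta)
  + \lambda_{\mathrm{pde}}\,\mathcal{L}_{\mathrm{pde}}(\theta),

% Residual of the 1-D transient heat equation u_t = \alpha u_{xx}
% evaluated at N_r interior collocation points (x_i, t_i)
\mathcal{L}_{\mathrm{pde}}(\theta)
  = \frac{1}{N_r} \sum_{i=1}^{N_r}
    \left(
      \frac{\partial u_\theta}{\partial t}(x_i, t_i)
      - \alpha\,\frac{\partial^2 u_\theta}{\partial x^2}(x_i, t_i)
    \right)^{2}.
```

Because a derivative-free optimizer such as BOA only needs the scalar value of \mathcal{L}(\theta), the loss never has to be differentiated with respect to \theta, although the residual term still requires derivatives of the network output with respect to x and t.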
