
Derivative Free Optimization of Complex Systems with the Use of Statistical Machine Learning Models


Abstract: This project focused on the development of novel derivative-free optimization methods that rely on recent techniques and models from statistical learning. The main idea of these methods is to build local models of the objective function from randomly sampled data points. This approach has many benefits; in particular, it allows us to construct fairly accurate models from a relatively small number of samples. The key difference from deterministic sampling approaches is that these models are accurate only with some high probability, not always. Moreover, it is not known when a given model is accurate; only the probability of an accurate model occurring is known. Under these conditions, novel convergence theory needed to be developed, which has been the focus of our research.
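The idea of building local models from randomly sampled points can be illustrated with a minimal sketch. It is not the project's algorithm: the linear model, the box-shaped sampling region, the acceptance threshold of 0.1, and all function names (`random_linear_model`, `minimize_with_random_models`) are assumptions chosen for illustration. The acceptance test matters because a randomly built model may be inaccurate, and a step is taken only when the true function confirms the decrease the model predicted.

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def random_linear_model(f, center, radius, n_samples, rng):
    """Fit f(center + radius*u) ~ c + b.u by least squares on random u in [-1, 1]^d.

    Returns the model gradient with respect to x, i.e. b / radius.
    """
    dim = len(center)
    rows, vals = [], []
    for _ in range(n_samples):
        u = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        rows.append([1.0] + u)
        vals.append(f([c + radius * ui for c, ui in zip(center, u)]))
    k = dim + 1
    # Normal equations of the least-squares problem.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Atb = [sum(r[i] * v for r, v in zip(rows, vals)) for i in range(k)]
    beta = solve(AtA, Atb)
    return [b / radius for b in beta[1:]]


def minimize_with_random_models(f, x0, radius=1.0, iters=200, seed=0):
    """Hypothetical trust-region-style loop driven by randomly sampled linear models."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    n_samples = 2 * (len(x) + 1)
    for _ in range(iters):
        g = random_linear_model(f, x, radius, n_samples, rng)
        g_norm = sum(gi * gi for gi in g) ** 0.5
        if g_norm < 1e-12:
            radius *= 0.5
            continue
        trial = [xi - radius * gi / g_norm for xi, gi in zip(x, g)]
        f_trial = f(trial)
        # Ratio of the actual decrease to the decrease the model predicted.
        rho = (fx - f_trial) / (radius * g_norm)
        if rho > 0.1:
            x, fx = trial, f_trial            # model was good enough: accept, expand
            radius = min(2.0 * radius, 10.0)
        else:
            radius *= 0.5                     # model was poor: reject, shrink
    return x, fx
```

Because a step is accepted only when the true function value decreases, an occasional inaccurate model costs one rejected iteration rather than a bad move.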

Similar Papers
  • Research Article
  • Cite Count: 6
  • 10.1088/1742-6596/2620/1/012007
A Derivative-free Trust-region Method for Optimization on the Ellipsoid
  • Oct 1, 2023
  • Journal of Physics: Conference Series
  • Pengcheng Xie

Optimization methods play a crucial role in various fields and applications. In some optimization problems, the derivative information of the objective function is unavailable. Such black-box optimization problems need to be solved by derivative-free optimization methods. At the same time, optimization problems with ellipsoidal constraints are important and have widespread applications in various fields as well. Following the development of the late Professor M. J. D. Powell’s efficient derivative-free trust-region optimization methods, this paper considers solving derivative-free optimization problems on the ellipsoid. Our new optimization solver EC-NEWUOA for problems on the ellipsoid in ℜ^n is designed based on Powell’s derivative-free software NEWUOA for unconstrained optimization problems. The proposed techniques for our new method mainly include using the Courant penalty function, the augmented Lagrangian method, and the projection technique. Details about the method and theoretical analysis are included in this paper. We also compare our new method with other algorithms by solving test problems and then show the numerical advantages of our new method.
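For orientation, the textbook forms of two of the techniques named above can be written down for a single inequality constraint. This is a sketch under stated assumptions, not the formulation used in the paper: the constraint form $c(x) \le 0$, the matrix $A$, the center $x_0$, the penalty parameter $\sigma$, and the multiplier $\lambda$ are all illustrative symbols.

```latex
% Assumed ellipsoidal constraint (A symmetric positive definite, x_0 the center):
%   c(x) = (x - x_0)^T A (x - x_0) - 1 \le 0
\begin{align*}
  % Courant (quadratic) penalty function
  P_\sigma(x) &= f(x) + \frac{\sigma}{2}\,\max\{0,\; c(x)\}^2, \\
  % Augmented Lagrangian for an inequality constraint
  L_\sigma(x,\lambda) &= f(x) + \frac{1}{2\sigma}
      \Bigl(\max\{0,\; \lambda + \sigma\, c(x)\}^2 - \lambda^2\Bigr), \\
  % Standard multiplier update after each unconstrained minimization
  \lambda^{+} &= \max\{0,\; \lambda + \sigma\, c(x)\}.
\end{align*}
```

In both cases the constrained problem is replaced by a sequence of unconstrained ones, which is what allows an unconstrained derivative-free solver to be reused.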

  • Conference Article
  • Cite Count: 4
  • 10.1109/mlcad52597.2021.9531234
Using Deep Neural Networks And Derivative Free Optimization To Accelerate Coverage Closure
  • Aug 30, 2021
  • Raviv Gal + 5 more

In computer aided design (CAD), a core task is to optimize the parameters of noisy simulations. Derivative free optimization (DFO) methods are the most common choice for this task. In this paper, we show how four DFO methods, specifically implicit filtering (IF), simulated annealing (SA), genetic algorithms (GA), and particle swarm (PS), can be accelerated using a deep neural network (DNN) that acts as a surrogate model of the objective function. In particular, we demonstrate the applicability of the DNN accelerated DFO approach to the coverage directed generation (CDG) problem that is commonly solved by hardware verification teams.

  • Research Article
  • Cite Count: 2
  • 10.7498/aps.63.149203
Numerical model error estimation by derivative-free optimization method
  • Jan 1, 2014
  • Acta Physica Sinica
  • Huang Qi-Can + 5 more

Initial error and model error are key factors restricting the accuracy of numerical weather prediction (NWP). The purpose of the present study is to estimate the errors of a spatiotemporal evolution model by using recent observations. Considering the continuous evolution of the atmosphere, the observed data (ignoring measurement error) can be viewed as a series of solutions of the accurate model governing the actual atmosphere, and the model error can be treated as an unknown functional term (a missing forcing term) of the numerical model. NWP can thus be considered an inverse problem: uncovering the unknown model error term from long periods of observed data. In this study, we first construct an inverse problem model and its associated optimization problem, constrained by the numerical model, to estimate the errors of the spatiotemporal evolution model. We then present a derivative-free optimization (DFO) method that finds the minimizer of the optimization problem by running the numerical model with an external forcing term. The DFO method needs neither the gradient of the objective functional nor the tangent linear model or adjoint model of the original numerical model. A numerical study of the Burgers equation indicates that the presented methods can effectively uncover the model errors from past data and evidently improve the numerical prediction. The procedures described in this paper open up possibilities for utilizing past observation data to extract useful information about model errors and enhance the prediction efficiency of operational models.

  • Research Article
  • Cite Count: 66
  • 10.1155/2020/9628957
Evaluation of Short-Term Freeway Speed Prediction Based on Periodic Analysis Using Statistical Models and Machine Learning Models
  • Jan 20, 2020
  • Journal of Advanced Transportation
  • Xiaoxue Yang + 4 more

Accurate prediction of traffic information (i.e., traffic flow, travel time, traffic speed, etc.) is a key component of an Intelligent Transportation System (ITS). Traffic speed is an important indicator for evaluating traffic efficiency. To date, although a few studies have considered the periodic feature in traffic prediction, very few have comprehensively evaluated the impact of the periodic component on statistical and machine learning prediction models. This paper selects several representative statistical models and machine learning models to analyze the influence of the periodic component on short-term speed prediction under different scenarios: (1) multi-horizon ahead prediction (5, 15, 30, 60 minutes ahead predictions), (2) with and without periodic component, (3) two data aggregation levels (5-minute and 15-minute), (4) peak hours and off-peak hours. Specifically, three statistical models (i.e., space time (ST) model, vector autoregressive (VAR) model, autoregressive integrated moving average (ARIMA) model) and three machine learning approaches (i.e., support vector machines (SVM) model, multi-layer perceptron (MLP) model, recurrent neural network (RNN) model) are developed and examined. Furthermore, the periodic features of the speed data are considered via a hybrid prediction method, which assumes that the data consist of two components: a periodic component and a residual component. The periodic component is described by a trigonometric regression function, and the residual component is modeled by the statistical models or the machine learning approaches.
The important conclusions can be summarized as follows: (1) the multi-step ahead prediction accuracy improves when considering the periodic component of speed data for all three statistical models and all three machine learning models, especially in the peak hours; (2) when the periodic component is considered, the improvement in prediction performance for all models gradually becomes larger as the time step increases; (3) under the same prediction horizon, the prediction performance of all models for 15-minute speed data is generally better than that for 5-minute speed data. Overall, the findings in this paper suggest that the proposed hybrid prediction approach is effective for both statistical and machine learning models in short-term speed prediction.
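The hybrid decomposition described above (a trigonometric periodic component plus a separately modeled residual) can be sketched as follows. The single-harmonic regression, the integer `period`, and the function names are assumptions for illustration. The AR(1) residual model is a deliberately simple stand-in for the ST, VAR, ARIMA, SVM, MLP, and RNN models the paper actually evaluates.

```python
import math


def fit_periodic_component(y, period):
    """Least-squares fit of y_t ~ a0 + a*cos(w t) + b*sin(w t), with w = 2*pi/period.

    Assumes len(y) is a whole number of periods and period > 2, so the three
    basis functions are orthogonal and the coefficients reduce to projections.
    """
    n = len(y)
    if n % period != 0 or period <= 2:
        raise ValueError("expected a whole number of periods with period > 2")
    w = 2.0 * math.pi / period
    a0 = sum(y) / n
    a = 2.0 / n * sum(v * math.cos(w * t) for t, v in enumerate(y))
    b = 2.0 / n * sum(v * math.sin(w * t) for t, v in enumerate(y))
    return a0, a, b


def periodic_value(coeffs, period, t):
    """Evaluate the fitted periodic component at time index t."""
    a0, a, b = coeffs
    w = 2.0 * math.pi / period
    return a0 + a * math.cos(w * t) + b * math.sin(w * t)


def fit_ar1(residual):
    """Least-squares AR(1) coefficient phi for r_t ~ phi * r_{t-1}."""
    num = sum(residual[t] * residual[t - 1] for t in range(1, len(residual)))
    den = sum(r * r for r in residual[:-1])
    return num / den if den > 0.0 else 0.0


def hybrid_forecast(y, period, horizon):
    """Forecast = extrapolated periodic component + AR(1) forecast of the residual."""
    coeffs = fit_periodic_component(y, period)
    residual = [v - periodic_value(coeffs, period, t) for t, v in enumerate(y)]
    phi = fit_ar1(residual)
    n = len(y)
    forecasts, r = [], residual[-1]
    for h in range(1, horizon + 1):
        r *= phi  # AR(1) residual forecast decays geometrically when |phi| < 1
        forecasts.append(periodic_value(coeffs, period, n - 1 + h) + r)
    return forecasts
```

Any residual model could take the place of `fit_ar1`; the split itself is what lets the periodic part be extrapolated over longer horizons.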

  • Research Article
  • 10.2118/0922-0066-jpt
Field-Development Optimization Method Benchmarked, Field Tested
  • Sep 1, 2022
  • Journal of Petroleum Technology
  • Chris Carpenter

This article, written by JPT Technology Editor Chris Carpenter, contains highlights of paper SPE 206267, “Benchmarking and Field Testing of the Distributed Quasi-Newton Derivative-Free Optimization Method for Field Development Optimization,” by Faruk Alpak, SPE, Yixuan Wang, SPE, and Guohua Gao, SPE, Shell, et al. The paper has not been peer reviewed.

Recently, a novel distributed quasi-Newton (DQN) derivative-free optimization (DFO) method was developed for generic reservoir-performance optimization problems, including well-location optimization (WLO) and well-control optimization (WCO). DQN is designed to locate multiple local optima of highly nonlinear optimization problems effectively. However, its performance has been neither validated by realistic applications nor compared with other DFO methods. Field-testing results reinforce the auspicious computational attributes of DQN.

Background: An optimization problem is posed as the minimization or maximization of an objective function by modifying the control variables (x) within a search domain. The objective function is a highly nonlinear function of x and may have multiple local optima. In this paper, the objective function is assumed to be twice differentiable. DFO methods can be classified into local search methods and global search methods. In the complete paper, the authors’ goal is to locate multiple local optima of the objective function. Their focus is on local search DFO optimization methods, which include direct search methods and model-based methods. Current DFO methods reviewed in the complete paper have a common feature: Only one best approximation to the solution is updated in each iteration, and only one optimal solution is identified in the last iteration. Therefore, they are referred to as single-thread DFO methods. They do not represent an efficient approach because simulation results obtained by one optimization task starting from one initial guess are not shared with other optimization tasks that start from different initial guesses. The distributed Gauss-Newton method and the DQN method benchmarked in the paper are referred to as multiple-thread optimization methods. The authors have integrated DQN into a versatile field-development optimization platform designed specifically for iterative work flows enabled through distributed-parallel flow simulations.

  • Research Article
  • Cite Count: 95
  • 10.1016/j.renene.2016.01.027
Comparison of Photovoltaic plant power production prediction methods using a large measured dataset
  • Jan 18, 2016
  • Renewable Energy
  • G Graditi + 2 more


  • Research Article
  • Cite Count: 16
  • 10.2118/203971-pa
An Efficient Bi-Objective Optimization Workflow Using the Distributed Quasi-Newton Method and Its Application to Well-Location Optimization
  • Dec 10, 2021
  • SPE Journal
  • Yixuan Wang + 6 more

Summary: Although it is possible to apply traditional optimization algorithms to determine the Pareto front of a multiobjective optimization problem, the computational cost is extremely high when the objective function evaluation requires solving a complex reservoir simulation problem and optimization cannot benefit from adjoint-based gradients. This paper proposes a novel workflow to solve bi-objective optimization problems using the distributed quasi-Newton (DQN) method, which is a well-parallelized and derivative-free optimization (DFO) method. Numerical tests confirm that the DQN method performs efficiently and robustly.

The efficiency of the DQN optimizer stems from a distributed computing mechanism that effectively shares the available information discovered in prior iterations. Rather than performing multiple quasi-Newton optimization tasks in isolation, simulation results are shared among distinct DQN optimization tasks or threads. In this paper, the DQN method is applied to the optimization of a weighted average of two objectives, using different weighting factors for different optimization threads. In each iteration, the DQN optimizer generates an ensemble of search points (or simulation cases) in parallel, and a set of nondominated points is updated accordingly. Different DQN optimization threads, which use the same set of simulation results but different weighting factors in their objective functions, converge to different optima of the weighted average objective function. The nondominated points found in the last iteration form a set of Pareto-optimal solutions. Robustness as well as efficiency of the DQN optimizer originates from reliance on a large, shared set of intermediate search points. On the one hand, this set of searching points is (much) smaller than the combined sets needed if all optimizations with different weighting factors would be executed separately; on the other hand, the size of this set produces a high fault tolerance, which means even if some simulations fail at a given iteration, the DQN method’s distributed-parallel information-sharing protocol is designed and implemented such that the optimization process can still proceed to the next iteration.

The proposed DQN optimization method is first validated on synthetic examples with analytical objective functions. Then, it is tested on well-location optimization (WLO) problems by maximizing the oil production and minimizing the water production. Furthermore, the proposed method is benchmarked against a bi-objective implementation of the mesh adaptive direct search (MADS) method, and the numerical results reinforce the auspicious computational attributes of DQN observed for the test problems.

To the best of our knowledge, this is the first time that a well-parallelized and derivative-free DQN optimization method has been developed and tested on bi-objective optimization problems. The methodology proposed can help improve efficiency and robustness in solving complicated bi-objective optimization problems by taking advantage of model-based search algorithms with an effective information-sharing mechanism.

NOTE: This paper is also published as part of the 2021 SPE Reservoir Simulation Conference Special Issue.
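Two bookkeeping steps mentioned in the summary, maintaining a set of nondominated points and scoring a shared pool of results under different weighting factors, can be sketched in a few lines. This is only an illustration of those two ideas, not the DQN method itself. Both objectives are assumed to be written in minimization form (for example, negated oil production and water production), and the function names are hypothetical.

```python
def dominates(p, q):
    """True if p is no worse than q in every objective and strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def nondominated(points):
    """Return the points that no other point in the pool dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def best_by_weight(points, weight):
    """Pick the point minimizing weight*f1 + (1 - weight)*f2 from a shared pool."""
    return min(points, key=lambda p: weight * p[0] + (1.0 - weight) * p[1])


# One shared pool of (f1, f2) results, reused by every weighting factor.
pool = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 2.0)]
front = nondominated(pool)
picks = {w: best_by_weight(pool, w) for w in (0.1, 0.9)}
```

Here different weights select different points from the same pool, which mirrors the statement that threads sharing one set of simulation results converge to different optima.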

  • Conference Article
  • Cite Count: 3
  • 10.2118/203971-ms
An Efficient Bi-Objective Optimization Workflow Using the Distributed Quasi-Newton Method and Its Application to Field Development Optimization
  • Oct 19, 2021
  • Yixuan Wang + 6 more

Although it is possible to apply traditional optimization algorithms to determine the Pareto front of a multi-objective optimization problem, the computational cost is extremely high, when the objective function evaluation requires solving a complex reservoir simulation problem and optimization cannot benefit from adjoint-based gradients. This paper proposes a novel workflow to solve bi-objective optimization problems using the distributed quasi-Newton (DQN) method, which is a well-parallelized and derivative-free optimization (DFO) method. Numerical tests confirm that the DQN method performs efficiently and robustly. The efficiency of the DQN optimizer stems from a distributed computing mechanism which effectively shares the available information discovered in prior iterations. Rather than performing multiple quasi-Newton optimization tasks in isolation, simulation results are shared among distinct DQN optimization tasks or threads. In this paper, the DQN method is applied to the optimization of a weighted average of two objectives, using different weighting factors for different optimization threads. In each iteration, the DQN optimizer generates an ensemble of search points (or simulation cases) in parallel and a set of non-dominated points is updated accordingly. Different DQN optimization threads, which use the same set of simulation results but different weighting factors in their objective functions, converge to different optima of the weighted average objective function. The non-dominated points found in the last iteration form a set of Pareto optimal solutions. Robustness as well as efficiency of the DQN optimizer originates from reliance on a large, shared set of intermediate search points. On the one hand, this set of searching points is (much) smaller than the combined sets needed if all optimizations with different weighting factors would be executed separately; on the other hand, the size of this set produces a high fault tolerance. 
Even if some simulations fail at a given iteration, DQN’s distributed-parallel information-sharing protocol is designed and implemented such that the optimization process can still proceed to the next iteration. The proposed DQN optimization method is first validated on synthetic examples with analytical objective functions. Then, it is tested on well location optimization problems, by maximizing the oil production and minimizing the water production. Furthermore, the proposed method is benchmarked against a bi-objective implementation of the MADS (Mesh Adaptive Direct Search) method, and the numerical results reinforce the auspicious computational attributes of DQN observed for the test problems. To the best of our knowledge, this is the first time that a well-parallelized and derivative-free DQN optimization method has been developed and tested on bi-objective optimization problems. The methodology proposed can help improve efficiency and robustness in solving complicated bi-objective optimization problems by taking advantage of model-based search optimization algorithms with an effective information-sharing mechanism.

  • Research Article
  • Cite Count: 17
  • 10.1016/j.ijmedinf.2020.104148
Assessing reproducibility and veracity across machine learning techniques in biomedicine: A case study using TCGA data
  • May 13, 2020
  • International Journal of Medical Informatics
  • Ahyoung Amy Kim + 2 more


  • Research Article
  • Cite Count: 66
  • 10.1007/s11081-016-9307-4
Blackbox and derivative-free optimization: theory, algorithms and applications
  • Feb 1, 2016
  • Optimization and Engineering
  • Charles Audet + 1 more

Blackbox optimization refers to problems where the structure of the objective and constraint functions cannot be exploited. This is often the case when their evaluation requires the execution of a (usually time-consuming) simulation using computational models that are typically inaccessible by the user. The term Derivative-Free Optimization refers to the use of algorithms that utilize only function values because their partial derivatives are either not defined or not available; gradient approximations may sometimes be obtained, but the amount of work required to ensure they are dependable may not be worth the effort. Both blackbox and derivative-free optimization have attracted significant, and still increasing, interest from researchers over the last decade. Thus, we felt that it was time to dedicate a special issue of OPTE to this topic. Blackbox and derivative-free optimization methods are often the only realistic and practical tools available to engineers working on simulation-based design. It is obvious that if the design optimization problem at hand allows an evaluation or reliable approximation of the gradients, then efficient gradient-based methods should be used. Blackbox and derivative-free algorithms are not competitors of gradient-based methods; they are a fallback when gradient-based algorithms cannot be used. The design engineering community is increasingly becoming aware that

  • Research Article
  • Cite Count: 13
  • 10.1080/02331934.2012.674946
Derivative-free optimization and neural networks for robust regression
  • Dec 1, 2012
  • Optimization
  • Gleb Beliakov + 2 more

Large outliers break down linear and nonlinear regression models. Robust regression methods allow one to filter out the outliers when building a model. By replacing the traditional least squares criterion with the least trimmed squares (LTS) criterion, in which half of data is treated as potential outliers, one can fit accurate regression models to strongly contaminated data. High-breakdown methods have become very well established in linear regression, but have started being applied for non-linear regression only recently. In this work, we examine the problem of fitting artificial neural networks (ANNs) to contaminated data using LTS criterion. We introduce a penalized LTS criterion which prevents unnecessary removal of valid data. Training of ANNs leads to a challenging non-smooth global optimization problem. We compare the efficiency of several derivative-free optimization methods in solving it, and show that our approach identifies the outliers correctly when ANNs are used for nonlinear regression.
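The plain least trimmed squares criterion mentioned above can be sketched as follows. The `keep_fraction` parameter and the exact rounding rule for the number of retained residuals are assumptions, and the penalized LTS criterion introduced in the paper is not reproduced here.

```python
import math


def lts_loss(residuals, keep_fraction=0.5):
    """Least trimmed squares: sum of the h smallest squared residuals.

    With keep_fraction=0.5, roughly half of the data is treated as potential
    outliers and ignored by the criterion.
    """
    squared = sorted(r * r for r in residuals)
    h = max(1, math.ceil(keep_fraction * len(squared)))
    return sum(squared[:h])


def line_residuals(slope, intercept, xs, ys):
    """Residuals of the line y = slope*x + intercept on the data (xs, ys)."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Data on the line y = 2x + 1, with three gross outliers at the end.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
ys[7], ys[8], ys[9] = 100.0, -50.0, 75.0

res = line_residuals(2.0, 1.0, xs, ys)
trimmed = lts_loss(res)               # ignores the three outliers
ordinary = sum(r * r for r in res)    # least squares is dominated by them
```

Sorting the squared residuals makes the criterion non-smooth in the model parameters, which is one reason derivative-free methods are a natural fit for minimizing it.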

  • Conference Article
  • Cite Count: 12
  • 10.2118/206267-ms
Benchmarking and Field-Testing of the Distributed Quasi-Newton Derivative-Free Optimization Method for Field Development Optimization
  • Sep 15, 2021
  • Faruk Alpak + 3 more

Recently, a novel distributed quasi-Newton (DQN) derivative-free optimization (DFO) method was developed for generic reservoir performance optimization problems including well-location optimization (WLO) and well-control optimization (WCO). DQN is designed to effectively locate multiple local optima of highly nonlinear optimization problems. However, its performance has neither been validated by realistic applications nor compared to other DFO methods. We have integrated DQN into a versatile field-development optimization platform designed specifically for iterative workflows enabled through distributed-parallel flow simulations. DQN is benchmarked against alternative DFO techniques, namely, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method hybridized with Direct Pattern Search (BFGS-DPS), Mesh Adaptive Direct Search (MADS), Particle Swarm Optimization (PSO), and Genetic Algorithm (GA). DQN is a multi-thread optimization method that distributes an ensemble of optimization tasks among multiple high-performance-computing nodes. Thus, it can locate multiple optima of the objective function in parallel within a single run. Simulation results computed from one DQN optimization thread are shared with others by updating a unified set of training data points composed of responses (implicit variables) of all successful simulation jobs. The sensitivity matrix at the current best solution of each optimization thread is approximated by a linear-interpolation technique using all or a subset of training-data points. The gradient of the objective function is analytically computed using the estimated sensitivities of implicit variables with respect to explicit variables. The Hessian matrix is then updated using the quasi-Newton method. A new search point for each thread is solved from a trust-region subproblem for the next iteration. In contrast, other DFO methods rely on a single-thread optimization paradigm that can only locate a single optimum. 
With such methods, locating multiple optima requires repeating the same optimization process multiple times, starting from different initial guesses. Moreover, simulation results generated from a single-thread optimization task cannot be shared with other tasks. Benchmarking results are presented for synthetic yet challenging WLO and WCO problems. Finally, the DQN method is field-tested on two realistic applications. DQN identifies the global optimum with the least number of simulations and the shortest run time on a synthetic problem with known solution. On other benchmarking problems without a known solution, DQN identified compatible local optima with reasonably smaller numbers of simulations compared to alternative techniques. Field-testing results reinforce the auspicious computational attributes of DQN. Overall, the results indicate that DQN is a novel and effective parallel algorithm for field-scale development optimization problems.
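The gradient and trust-region steps described in this abstract can be written compactly under some assumed notation. The symbols below are not taken from the paper: $x$ denotes the explicit variables, $y(x)$ the implicit variables (simulation responses), $F$ the objective as a function of both, $S$ the estimated sensitivity matrix, $H$ the quasi-Newton Hessian approximation, and $\Delta$ the trust-region radius.

```latex
\begin{align*}
  % Objective written through the implicit variables
  f(x) &= F\bigl(x,\, y(x)\bigr), \qquad
  S \approx \frac{\partial y}{\partial x}
  \quad \text{(estimated by linear interpolation of training points)}, \\
  % Chain rule with the estimated sensitivities
  g = \nabla f(x) &\approx \nabla_x F + S^{\mathsf T}\, \nabla_y F, \\
  % Trust-region subproblem that yields the next search point x + s
  \min_{s}\;& g^{\mathsf T} s + \tfrac12\, s^{\mathsf T} H s
  \quad \text{subject to } \lVert s \rVert \le \Delta .
\end{align*}
```

Read this way, only $S$ has to be estimated from simulation data; the derivatives of $F$ with respect to its arguments are available analytically, which appears to be the sense in which the gradient is "analytically computed".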

  • Book Chapter
  • Cite Count: 6
  • 10.1016/b978-0-444-63428-3.50036-9
Derivative-Free Chemical Process Synthesis by Memetic Algorithms Coupled to Aspen Plus Process Models
  • Jan 1, 2016
  • Computer Aided Chemical Engineering
  • Maren Urselmann + 7 more


  • Research Article
  • Cite Count: 7
  • 10.1186/s12889-022-14541-7
Spatial statistical machine learning models to assess the relationship between development vulnerabilities and educational factors in children in Queensland, Australia
  • Nov 30, 2022
  • BMC Public Health
  • Wala Draidi Areed + 3 more

Background: The health and development of children during their first year of full time school is known to impact their social, emotional, and academic capabilities throughout and beyond early education. Physical health, motor development, social and emotional well-being, learning styles, language and communication, cognitive skills, and general knowledge are all considered to be important aspects of a child’s health and development. It is important for many organisations and governmental agencies to continually improve their understanding of the factors which determine or influence development vulnerabilities among children. This article studies the relationships between development vulnerabilities and educational factors among children in Queensland, Australia.

Methods: Spatial statistical machine learning models are reviewed and compared in the context of a study of geographic variation in the association between development vulnerabilities and attendance at preschool among children in Queensland, Australia. A new spatial random forest (SRF) model is suggested that can explain more of the spatial variation in data than other approaches.

Results: In the case study, spatial models were shown to provide a better fit compared to models that ignored the spatial variation in the data. The SRF model was shown to be the only model which can explain all of the spatial variation in each of the development vulnerabilities considered in the case study. The spatial analysis revealed that the attendance at preschool factor has a strong influence on the physical health domain vulnerability and emotional maturity vulnerability among children in their first year of school.

Conclusion: This study confirmed that it is important to take into account the spatial nature of data when fitting statistical machine learning models. A new spatial random forest model was introduced and was shown to explain more of the spatial variation and provide a better model fit in the case study of development vulnerabilities among children in Queensland. At small-area population level, increased attendance at preschool was strongly associated with reduced physical and emotional development vulnerabilities among children in their first year of school.

  • Preprint Article
  • 10.5194/egusphere-egu21-2451
Towards data-driven estimates of the transient climate response to cumulative CO2 emissions using interpretable statistical learning methods
  • Mar 3, 2021
  • Katarzyna Tokarska + 2 more

CO2-induced warming is approximately proportional to the total amount of CO2 emitted. This emergent property of the climate system, known as the Transient Climate Response to cumulative CO2 Emissions (TCRE), gave rise to the concept of a remaining carbon budget that specifies a cap on global CO2 emissions in line with reaching a given temperature target, such as those in the Paris Agreement (e.g., Matthews et al. 2020). However, estimating the policy-relevant TCRE metric directly from the observation-based data products remains challenging due to non-CO2 forcing and land-use change emissions present in the real-world climate conditions.

Here, we present preliminary results for applying and comparing different statistical learning methods to determine TCRE (and later, remaining carbon budgets) from: (i) climate models’ output and (ii) the observational data products. First, we make use of a ‘perfect-model’ setting, i.e. using output from physics-based climate models (CMIP5 and CMIP6) under historical forcing (treated as pseudo-observations). This output is used to train different statistical-learning models, and to make predictions of TCRE (which are known from climate model simulations under CO2-only forcing, per experimental design). Next, we use such trained statistical learning models to make TCRE predictions directly from the observation-based data products.

We also explore interpretability of the applied techniques, to determine where the statistical models are learning from, what the regions of importance are, and the key input features and weights. Explainable AI methods (e.g., McGovern et al. 2019; Molnar 2019; Samek et al. 2019) present a promising way forward in linking data-driven statistical and machine learning methods with traditional physical climate sciences, while leveraging from the large amount of data from the observational data products to provide more robust estimates of, often policy relevant, climate metrics.
