Abstract

Data-driven models are becoming increasingly popular in engineering, on their own or in combination with mechanistic models. Commonly, the trained models are subsequently used in model-based optimization of the design and/or operation of processes. Thus, it is critical to ensure that data-driven models are not evaluated outside their validity domain during process optimization. We propose a method to learn this validity domain and encode it as constraints in process optimization. We first perform a topological data analysis using persistent homology to identify potential holes or separated clusters in the training data. If clusters or holes are identified, we train a one-class classifier, namely a one-class support vector machine, on the training data domain and encode it as constraints in the subsequent process optimization. Otherwise, we construct the convex hull of the data and encode it as constraints. Finally, we perform deterministic global process optimization with the data-driven models subject to their respective validity constraints. To ensure computational tractability, we develop a reduced-space formulation for trained one-class support vector machines and show that our formulation outperforms common full-space formulations by a factor of over 3000, making it a viable tool for engineering applications. The method is ready to use and available open-source as part of our MeLOn toolbox (https://git.rwth-aachen.de/avt.svt/public/MeLOn).
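To make the idea of "encoding a one-class support vector machine as a constraint" concrete, the following minimal sketch evaluates the standard one-class SVM decision function as an explicit function of the input, so that the validity domain reduces to a single inequality g(x) >= 0. It is an illustration only, not the reduced-space formulation implemented in MeLOn: the Gaussian kernel, the function names, and all numerical values (support vectors, coefficients, offset, kernel width) are assumptions made up for this example rather than results of an actual training run.

```python
import math


def rbf_kernel(x, sv, gamma):
    """Gaussian (RBF) kernel exp(-gamma * ||x - sv||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, sv))
    return math.exp(-gamma * sq_dist)


def ocsvm_decision(x, support_vectors, alphas, rho, gamma):
    """Decision value g(x) = sum_i alpha_i * K(sv_i, x) - rho.

    g is written purely in terms of the input x; no auxiliary
    variables are introduced for the kernel evaluations.
    """
    weighted = (a * rbf_kernel(x, sv, gamma)
                for a, sv in zip(alphas, support_vectors))
    return sum(weighted) - rho


def in_validity_domain(x, model):
    """Validity constraint: x is accepted if g(x) >= 0."""
    return ocsvm_decision(x, **model) >= 0.0


# Hypothetical parameters, invented for illustration (not trained values).
MODEL = {
    "support_vectors": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    "alphas": [0.4, 0.3, 0.3],
    "rho": 0.5,
    "gamma": 1.0,
}

if __name__ == "__main__":
    for point in [(0.3, 0.3), (3.0, 3.0)]:
        g = ocsvm_decision(point, **MODEL)
        print(point, round(g, 4), in_validity_domain(point, MODEL))
```

In an optimization problem, x would be a decision variable and g(x) >= 0 would be added as a nonlinear constraint next to the data-driven model itself. With the made-up parameters above, the point close to the support vectors satisfies the constraint, while the distant point does not.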

Highlights

  • Supervised machine-learning techniques have been re-emerging as a promising avenue for data-driven modeling in various engineering disciplines (Venkatasubramanian 2019)

  • Note that we refer to the validity domain of individual data-driven models throughout this work, but the concept can be applied to hybrid models (Kahrs and Marquardt 2007)

Summary

Introduction

Supervised machine-learning techniques have been re-emerging as a promising avenue for data-driven modeling in various engineering disciplines (Venkatasubramanian 2019). The vast majority of previous publications use box constraints (i.e., hyperrectangles) to bound the inputs of data-driven models, so that each variable has independent bounds. This approach is practical when the training data is obtained from simulations based on regular grids or Latin hypercubes that are sufficiently dense. Following the proposal of Courrieu (1994), a few previous works in process systems engineering (PSE) constructed the convex hull of the training data points to describe the validity domain and integrated it as a set of linear constraints in optimization problems (Kahrs and Marquardt 2007; Zhang et al. 2016; Asprion et al. 2019). In this work, if clusters or holes are identified in the training data, we train a one-class support vector machine (SVM) on the training data domain of the data-driven models and encode it as constraints in the subsequent process optimization. We demonstrate the potential of our method on a set of illustrative mathematical case studies and an engineering case study, namely the open-loop control of a sulfur recovery unit.
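The difference between box constraints and convex-hull constraints can be illustrated with a small two-dimensional sketch. It computes the convex hull of a handful of made-up training points with Andrew's monotone chain algorithm, converts each hull edge into a linear inequality a1*x + a2*y <= b, and compares the result with independent variable bounds. The data, the function names, and the restriction to two dimensions are assumptions for illustration; the cited works may construct and encode the hull differently.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_halfspaces(hull):
    """One row (a1, a2, b) per edge; hull points satisfy a1*x + a2*y <= b."""
    rows = []
    n = len(hull)
    for i in range(n):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % n]
        # Outward normal of a counter-clockwise edge (dx, dy) is (dy, -dx).
        a1, a2 = y2 - y1, -(x2 - x1)
        rows.append((a1, a2, a1 * x1 + a2 * y1))
    return rows


def in_hull(p, rows, tol=1e-9):
    """Check all linear hull constraints at point p."""
    return all(a1 * p[0] + a2 * p[1] <= b + tol for a1, a2, b in rows)


def in_box(p, points):
    """Check independent lower and upper bounds for each variable."""
    return all(min(q[i] for q in points) <= p[i] <= max(q[i] for q in points)
               for i in range(len(p)))


# Made-up training inputs that fill only a triangle of the unit square.
DATA = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.3, 0.3), (0.5, 0.2)]
ROWS = hull_halfspaces(convex_hull(DATA))

if __name__ == "__main__":
    for point in [(0.2, 0.2), (0.9, 0.9)]:
        print(point, "box:", in_box(point, DATA), "hull:", in_hull(point, ROWS))
```

For this data, the point (0.9, 0.9) satisfies the box constraints but violates the hull inequality x + y <= 1, so an optimizer restricted only by variable bounds could evaluate the model in a region that contains no training data. In higher dimensions a dedicated hull library would normally replace the two-dimensional routine used here.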

Methodology
Topological data analysis using persistent homology
Learn validity domain using one-class support vector machines
Optimization with classifier as constraint
Illustrative case studies
Topological data analysis
Validity domain modeling
Engineering application
Conclusion
Compliance with ethical standards