In many industrial applications, data-driven models are increasingly employed as an alternative to classical analytical descriptions or simulations. In particular, such models are often used to predict the outcome of an industrial process with respect to specific quality characteristics from both observed process parameters and control variables. A major step from purely predictive towards prescriptive analytics, i.e., towards leveraging data-driven models for process optimization, consists of determining, for given process parameters, control variable values such that the output quality improves according to the process model. This task naturally leads to a constrained optimization problem over data-driven prediction algorithms. In many cases, however, the best available models suffer from a lack of regularity: methods such as gradient boosting or random forests are generally non-differentiable and may even exhibit discontinuities. Optimizing over these models would therefore require derivative-free techniques. Here, we discuss the use of alternative, independently trained differentiable machine learning models as surrogates during the optimization procedure. While these surrogates are generally less accurate representations of the actual process, the possibility of employing derivative-based optimization methods yields major advantages in terms of computational performance. Using classical benchmarks as well as a real-world dataset obtained from an industrial environment, we demonstrate that these advantages can outweigh the additional model error, especially in real-time applications.
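The surrogate-based optimization idea can be illustrated with a minimal toy sketch (a hypothetical example, not the paper's actual method or dataset): a non-differentiable "process model" f plays the role of a tree-ensemble predictor, a differentiable quadratic surrogate g is fitted to samples of f, and the control variable is then optimized by gradient ascent on g.

```python
# Toy sketch of surrogate-based optimization (assumed example):
# f is a non-differentiable quality prediction in the control
# variable u, like a tree ensemble would be; g is a differentiable
# surrogate fitted to samples of f and optimized by gradient ascent.
import numpy as np

rng = np.random.default_rng(0)

def f(u):
    # non-differentiable "quality" to be maximized: a smooth optimum
    # near u = 2 plus a step discontinuity
    return -(u - 2.0) ** 2 + 0.5 * np.sign(np.sin(5 * u))

# sample the black-box model and fit a quadratic surrogate g(u)
u_train = rng.uniform(-1, 5, 200)
y_train = f(u_train)
a, b, c = np.polyfit(u_train, y_train, 2)   # g(u) = a*u**2 + b*u + c

def grad_g(u):
    # analytic gradient of the surrogate, available in closed form
    return 2 * a * u + b

# gradient ascent on the surrogate from an arbitrary start
u = 0.0
for _ in range(200):
    u += 0.1 * grad_g(u)
# u ends up near the smooth optimum of f at u = 2
```

The surrogate is deliberately less accurate than f (it ignores the discontinuities), but because its gradient is available in closed form, each optimization step is cheap, which is the trade-off the abstract describes.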