Abstract

In this paper, we solve the well-known symbolic regression problem that has been intensively studied and has a wide range of applications. To solve it, we propose an efficient metaheuristic-based approach, called RILS-ROLS. RILS-ROLS is based on the following two elements: (i) iterated local search, which is the method backbone, mainly solving combinatorial and some continuous aspects of the problem; (ii) ordinary least squares method, which focuses on the continuous aspect of the search space—it efficiently determines the best—fitting coefficients of linear combinations within solution equations. In addition, we introduce a novel fitness function that combines important model quality measures: R^2 score, RMSE score, size of the model (or model complexity), and carefully designed local search, which allows systematic search in proximity to candidate solution. Experiments are conducted on the two well-known ground-truth benchmark sets from literature: Feynman and Strogatz. RILS-ROLS was compared to 14 other competitors from the literature. Our method outperformed all 14 competitors with respect to the symbolic solution rate under varying levels of noise. We observed the robustness of the method with respect to noise, as the symbolic solution rate decreases relatively slowly with increasing noise. Statistical analysis of the obtained experimental results confirmed that RILS-ROLS is a new state-of-the-art method for solving the problem of symbolic regression on datasets whose target variable is modelled as a closed-form equation with allowed operators. In addition to evaluation on known ground-truth datasets, we introduced a new randomly generated set of problem instances. The goal of this set of instances was to test the sensitivity of our method with respect to incremental equation sizes under different levels of noise. We have also proposed a parallelized extension of RILS-ROLS that has proven adequate in solving several very large instances with 1 million records and up to 15 input variables.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call