# Contemporary Symbolic Regression Methods and their Relative Performance

@article{Cava2021ContemporarySR, title={Contemporary Symbolic Regression Methods and their Relative Performance}, author={W. La Cava and Patryk Orzechowski and Bogdan Burlacu and Fabrício Olivetti de França and M. Virgolin and Ying Jin and Michael Kommenda and Jason H. Moore}, journal={ArXiv}, year={2021}, volume={abs/2107.14351} }

Many promising approaches to symbolic regression have been presented in recent years, yet progress in the field continues to suffer from a lack of uniform, robust, and transparent benchmarking standards. We address this shortcoming by introducing an open-source, reproducible benchmarking platform for symbolic regression. We assess 14 symbolic regression methods and 7 machine learning methods on a set of 252 diverse regression problems. Our assessment includes both real-world datasets with no…

## 30 Citations

### GSR: A Generalized Symbolic Regression Approach

- Computer Science
- ArXiv
- 2022

This paper presents GSR, a Generalized Symbolic Regression approach, by modifying the conventional SR optimization problem formulation, while keeping the main SR objective intact, and proposes a genetic programming approach with a matrix-based encoding scheme.

### Symbolic Regression is NP-hard

- Computer Science
- ArXiv
- 2022

Symbolic regression (SR) is the task of learning a model of data in the form of a mathematical expression. By their nature, SR models have the potential to be accurate and human-interpretable at the…

### End-to-end symbolic regression with transformers

- Computer Science
- ArXiv
- 2022

This paper challenges this two-step procedure and tasks a Transformer with directly predicting the full mathematical expression, constants included. It presents ablations showing that this end-to-end approach yields better results, sometimes even without the refinement step.

### SciMED: A Computational Framework For Physics-Informed Symbolic Regression with Scientist-In-The-Loop

- Computer Science
- ArXiv
- 2022

A novel, open-source computational framework called Scientist-Machine Equation Detector (SciMED), which integrates scientific discipline wisdom in a scientist-in-the-loop approach with state-of-the-art symbolic regression (SR) methods.

### Transformation-interaction-rational representation for symbolic regression

- Computer Science
- Proceedings of the Genetic and Evolutionary Computation Conference
- 2022

An extension to this representation, called Transformation-Interaction-Rational representation, is proposed that defines a new function form as the rational of two Interaction-Transformation functions, and the target variable can also be transformed with a univariate function.

### On genetic programming representations and fitness functions for interpretable dimensionality reduction

- Computer Science
- GECCO
- 2022

It is found that various GP methods can be competitive with state-of-the-art DR algorithms and that they have the potential to produce interpretable DR mappings.

### Genetic programming, standardisation, and stochastic gradient descent revisited: initial findings on SRBench

- Computer Science
- GECCO Companion
- 2022

A recalibrated variant of GPZGD is introduced and its performance within the recently-proposed SRBench framework is tested: the resulting variant demonstrates excellent performance relative to existing symbolic regression methods.

### Evolvability degeneration in multi-objective genetic programming for symbolic regression

- Computer Science
- GECCO
- 2022

A new version of NSGA-II is extended to track, over time, the evolvability of models of different levels of complexity, and it is found that the over-replication of low-complexity models is due to a lack of evolvability, i.e., the inability to produce offspring with improved accuracy.

### Multi-modal multi-objective model-based genetic programming to find multiple diverse high-quality models

- Computer Science
- GECCO
- 2022

A novel multi-modal multi-tree multi-objective GP approach that extends a modern model-based GP algorithm known as GP-GOMEA that is already effective at searching for small expressions.

## References

### Benchmarking state-of-the-art symbolic regression algorithms

- Computer Science
- Genetic Programming and Evolvable Machines
- 2020

This paper conceptually and experimentally compares several representatives of multiple linear regression algorithms, including GPTIPS, FFX, and EFS, which are applied as off-the-shelf, ready-to-use techniques in the field of SR.

### Where are we now?: a large benchmark study of recent symbolic regression methods

- Computer Science
- GECCO
- 2018

The results suggest that symbolic regression performs strongly compared to state-of-the-art gradient boosting algorithms, although in terms of running time it is among the slowest of the available methodologies.

### FFX: Fast, Scalable, Deterministic Symbolic Regression Technology

- Computer Science
- 2011

A new non-evolutionary technique for symbolic regression that is orders of magnitude faster than competent GP approaches on real-world problems, returns simpler models, has comparable or better prediction on unseen data, and converges reliably and deterministically.

### Improving Model-Based Genetic Programming for Symbolic Regression of Small Expressions

- Computer Science
- Evolutionary Computation
- 2021

This article shows that the non-uniformity in the distribution of the genotype in GP populations negatively biases linkage learning (LL), proposes a method to correct this, and finds that GOMEA is a promising new approach to SR.

### Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients

- Computer Science
- ICLR
- 2021

The proposed framework uses a recurrent neural network to emit a distribution over tractable mathematical expressions, and employs reinforcement learning to train the network to generate better-fitting expressions, which significantly outperforms standard genetic programming-based symbolic regression in its ability to exactly recover symbolic expressions.

### Pareto-Front Exploitation in Symbolic Regression

- Computer Science
- 2005

This work prefers parsimonious (simple) expressions with the expectation that they are more robust with respect to changes over time in the underlying system or extrapolation outside the range of the data used as the reference in evolving the symbolic regression.

### Feature standardisation and coefficient optimisation for effective symbolic regression

- Computer Science
- GECCO
- 2020

It is demonstrated that standardisation allows a simpler function set to be used without increasing bias and can significantly improve the performance of coefficient optimisation through gradient descent to produce accurate models.

### AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity

- Computer Science, Mathematics
- NeurIPS
- 2020

We present an improved method for symbolic regression that seeks to fit data to formulas that are Pareto-optimal, in the sense of having the best accuracy for a given complexity. It improves on the…

### AI Feynman: A physics-inspired method for symbolic regression

- Physics, Computer Science
- Science Advances
- 2020

This work develops a recursive multidimensional symbolic regression algorithm that combines neural network fitting with a suite of physics-inspired techniques and improves the state-of-the-art success rate.

### Bayesian Symbolic Regression

- Computer Science
- 2019

The proposed BSR (Bayesian Symbolic Regression) method saves computer memory because it does not need to keep an updated 'genome pool', and numerical experiments show that, compared with GP, the solutions of BSR are closer to the ground truth and the expressions are more concise.