Depth without distress

Upgrading a linear classifier to a nonlinear one typically involves a statistical or computational compromise. For example, lifting the input into a higher-dimensional feature space, as kernel methods do, may lead to overfitting. The nonconvex optimization algorithms used to train (deep) neural networks or decision trees may fail to find high-quality solutions, or even to converge. Similarly, boosting can get stuck…
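
As a rough illustration of the lifting trade-off (a minimal sketch of the general idea, not anything prescribed by this text): a linear classifier trained on explicitly lifted polynomial features becomes nonlinear in the original space, and an aggressive lift can start fitting noise. The synthetic dataset, the lift degree, and the scikit-learn models below are all illustrative assumptions.

```python
# Sketch: "lifting the dimension" turns a linear classifier nonlinear,
# at the risk of overfitting. All modeling choices here are illustrative.
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Concentric circles: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=400, noise=0.15, factor=0.5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Baseline: a linear classifier on the raw 2-D inputs.
linear = LogisticRegression().fit(X_tr, y_tr)

# The same linear classifier after lifting to degree-10 polynomial features.
# The high-degree lift gains expressive power but can fit the label noise,
# which is the statistical compromise described above.
lifted = make_pipeline(
    PolynomialFeatures(degree=10),
    LogisticRegression(max_iter=5000),
).fit(X_tr, y_tr)

print("linear train/test acc:", linear.score(X_tr, y_tr), linear.score(X_te, y_te))
print("lifted train/test acc:", lifted.score(X_tr, y_tr), lifted.score(X_te, y_te))
```

On data like this, the lifted model typically separates the circles where the purely linear one cannot, but raising the degree tends to widen the gap between training and test accuracy, the overfitting risk mentioned above.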