# MACRO: A Meta-Algorithm for Conditional Risk Minimization

```bibtex
@article{Zimin2018MACROAM,
  title   = {MACRO: A Meta-Algorithm for Conditional Risk Minimization},
  author  = {Alexander Zimin and Christoph H. Lampert},
  journal = {arXiv: Machine Learning},
  year    = {2018}
}
```

We study conditional risk minimization (CRM), i.e., the problem of learning a hypothesis of minimal risk for prediction at the next step of sequentially arriving dependent data. Although this is a fundamental problem, successful learning in the CRM sense has so far only been demonstrated using theoretical algorithms that cannot be applied to real problems, as they would require storing all incoming data. In this work, we introduce MACRO, a meta-algorithm for CRM that does not suffer from this…

## References

Showing 1–10 of 34 references

Conditional Risk Minimization for Stochastic Processes

- Computer Science · ArXiv
- 2015

A practical estimator for the conditional risk is proposed, based on the theory of non-parametric time-series prediction, together with a finite-sample concentration bound that establishes uniform convergence of the estimator to the true conditional risk under certain regularity assumptions on the process.

Learning Theory for Conditional Risk Minimization

- Computer Science · AISTATS
- 2017

The main results are two theorems that establish criteria for learnability for many classes of stochastic processes, including all special cases studied previously in the literature.

Online Learning with Prior Knowledge

- Computer Science · COLT
- 2007

The standard so-called experts algorithms are methods for utilizing a given set of "experts" to make good choices in a sequential decision-making problem. This work extends the setting by allowing an experts algorithm to rely on state information, namely partial information about the cost function, which is revealed to the decision maker before an action is chosen.
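
A minimal sketch of a standard experts algorithm (the classic exponentially weighted forecaster, often called Hedge) may help make the summary above concrete; the function name, learning rate, and loss sequence here are illustrative assumptions, not details from the cited paper:

```python
import math

def hedge(loss_rows, eta=0.5):
    """Standard exponentially weighted 'experts' forecaster (Hedge).

    loss_rows: sequence of per-round loss vectors, one entry per expert,
    with losses in [0, 1]. Returns the list of weight vectors the
    forecaster would have played in each round."""
    weights = None
    history = []
    for losses in loss_rows:
        if weights is None:  # uniform prior over experts on the first round
            weights = [1.0 / len(losses)] * len(losses)
        history.append(list(weights))
        # Multiplicative update: down-weight each expert by exp(-eta * loss).
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]  # renormalize to a distribution
    return history

# Expert 0 consistently incurs lower loss, so its weight grows over rounds.
ws = hedge([[0.1, 0.9]] * 5)
```

State-dependent variants, as in the cited work, would additionally condition the weight update on side information revealed before each action.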

Predictive PAC Learning and Process Decompositions

- Computer Science · NIPS
- 2013

It is argued that in predictive PAC it is natural to condition not on the past observations but on the mixture component of the sample path, and a novel PAC generalization bound is proved for mixtures of learnable processes, with a generalization error no worse than that of each mixture component.

Optimal learning with Bernstein online aggregation

- Computer Science · Machine Learning
- 2016

This work introduces a new recursive aggregation procedure called Bernstein Online Aggregation (BOA), which is optimal for the model selection aggregation problem in the bounded i.i.d. setting for the square loss and is the first online algorithm to achieve a fast rate of convergence.

Time series prediction and online learning

- Computer Science · COLT
- 2016

The first generalization bounds for a hypothesis derived by online-to-batch conversion of the sequence of hypotheses output by an online algorithm are proved, in the general setting of a non-stationary non-mixing stochastic process.

On the generalization ability of on-line learning algorithms

- Computer Science · IEEE Transactions on Information Theory
- 2004

This paper proves tight data-dependent bounds for the risk of this hypothesis in terms of an easily computable statistic $M_n$ associated with the on-line performance of the ensemble, and obtains risk tail bounds for kernel perceptron algorithms in terms of the spectrum of the empirical kernel matrix.
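
The online-to-batch conversion underlying this line of work can be sketched as follows; the toy `OnlineMean` learner and the choice of averaging the intermediate hypotheses are illustrative assumptions (random selection of a hypothesis is another standard conversion), not the cited paper's exact construction:

```python
class OnlineMean:
    """Toy online learner: running estimate of E[y] via online gradient
    descent on the squared loss with step size 1/t."""
    def __init__(self):
        self.theta, self.t = 0.0, 0
    def predictor(self):
        return self.theta
    def update(self, y):
        self.t += 1
        self.theta += (y - self.theta) / self.t

def online_to_batch(learner, ys):
    """Online-to-batch conversion: run the online learner over the
    sample, snapshot the hypothesis after each update, and return the
    average of the snapshots as the single batch hypothesis."""
    snapshots = []
    for y in ys:
        learner.update(y)
        snapshots.append(learner.predictor())
    return sum(snapshots) / len(snapshots)

theta_hat = online_to_batch(OnlineMean(), [1.0, 3.0, 2.0, 2.0])
```

The risk of the averaged hypothesis is then controlled by the online learner's cumulative performance, which is the quantity the data-dependent bounds above are stated in terms of.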

Prediction of time series by statistical learning: general losses and fast rates

- Mathematics
- 2012

We establish rates of convergence in statistical learning for time series forecasting. Using the PAC-Bayesian approach, slow rates of convergence $\sqrt{d/n}$ for the Gibbs estimator under the…

Predictive PAC Learnability: A Paradigm for Learning from Exchangeable Input Data

- Mathematics · 2010 IEEE International Conference on Granular Computing
- 2010

Using de Finetti's theorem, it is shown that if a universally separable function class $\mathscr F$ is distribution-free PAC learnable under i.i.d. inputs, then it is distribution-free predictive PAC learnable under exchangeable inputs, with a slightly worse sample complexity.

Stability Bounds for Stationary φ-mixing and β-mixing Processes

- Computer Science
- 2009

Novel and distinct stability-based generalization bounds for stationary φ-mixing and β-mixing sequences are proved, which can be viewed as the first theoretical basis for the use of these algorithms in non-i.i.d. scenarios.