Scoring interval forecasts: Equal-tailed, shortest, and modal interval

@article{Brehmer2020ScoringIF,
  title={Scoring interval forecasts: Equal-tailed, shortest, and modal interval},
  author={Jonas Brehmer and Tilmann Gneiting},
  journal={Bernoulli},
  year={2020}
}
We consider different types of predictive intervals and ask whether they are elicitable, i.e. are unique minimizers of a loss or scoring function in expectation. The equal-tailed interval is elicitable, with a rich class of suitable loss functions, though subject to either translation invariance, or positive homogeneity and differentiability, the Winkler interval score becomes a unique choice. The modal interval also is elicitable, with a sole consistent scoring function, up to equivalence… 
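As an illustration of the Winkler interval score named in the abstract, the following minimal Python sketch uses the standard form for a central (1 - alpha) prediction interval [l, u] and outcome y: the interval width plus a penalty of 2/alpha times the distance by which y misses the interval. The function name `interval_score` and its argument names are illustrative assumptions, not taken from the paper.

```python
def interval_score(lower, upper, y, alpha):
    """Winkler interval score for a central (1 - alpha) prediction interval.

    Hypothetical helper for illustration; smaller scores are better.
    score = (upper - lower)
            + (2 / alpha) * (lower - y) if y falls below the interval
            + (2 / alpha) * (y - upper) if y falls above the interval
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if lower > upper:
        raise ValueError("lower must not exceed upper")

    width = upper - lower                             # rewards narrow intervals
    miss_below = (2.0 / alpha) * max(lower - y, 0.0)  # outcome under the interval
    miss_above = (2.0 / alpha) * max(y - upper, 0.0)  # outcome over the interval
    return width + miss_below + miss_above
```

When the outcome falls inside the interval, the score reduces to the width alone; a miss adds a penalty that grows as alpha shrinks, so intervals with a higher nominal level are penalized more heavily for failing to cover.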

Forecast evaluation of quantiles, prediction intervals, and other set-valued functionals

We introduce a theoretical framework of elicitability and identifiability of set-valued functionals, such as quantiles, prediction intervals, and systemic risk measures. A functional is elicitable if

On the Aggregation of Probability Assessments: Regularized Mixtures of Predictive Densities for Eurozone Inflation and Real Interest Rates

We propose methods for constructing regularized mixtures of density forecasts. We explore a variety of objectives and regularization penalties, and we use them in a substantive exploration of

Comparing Confidence Intervals for a Binomial Proportion with the Interval Score

TLDR
This work evaluates eleven CIs for the binomial proportion based on the expected interval score and proposes a summary measure which can take into account different weighting of the underlying true proportion.

Bayes risk, elicitability, and the Expected Shortfall

Motivated by recent advances on elicitability of risk measures and practical considerations of risk optimization, we introduce the notions of Bayes pairs and Bayes risk measures. Bayes risk measures

The Efficiency Gap

Parameter estimation via M- and Z-estimation is broadly considered to be equally powerful in semiparametric models for one-dimensional functionals. This is due to the fact that, under sufficient

Censored Density Forecasts: Production and Evaluation

This paper develops methods for the production and evaluation of censored density forecasts. Censored density forecasts quantify forecast risks in a middle region of the density covering a specified

Model Comparison and Calibration Assessment: User Guide for Consistent Scoring Functions in Machine Learning and Actuarial Practice

TLDR
This user guide revisits statistical techniques to assess the calibration or adequacy of a model on the one hand, and to compare and rank different models on the other hand, emphasising the importance of specifying the prediction target functional at hand a priori.

Evaluating epidemic forecasts in an interval format

TLDR
This article discusses the computation and interpretation of the weighted interval score, which can be interpreted as a generalization of the absolute error to probabilistic forecasts and allows for a decomposition into a measure of sharpness and penalties for over- and underprediction.

Combining probabilistic forecasts of COVID-19 mortality in the United States

Forecasting: theory and practice

References

On the indirect elicitability of the mode and modal interval

TLDR
It is shown that this cannot be done: Neither the mode nor a modal interval is indirectly elicitable with respect to the class of identifiable functionals.

On the Comparison of Interval Forecasts

We explore interval forecast comparison when the nominal confidence level is specified, but the quantiles on which intervals are based are not specified. It turns out that the problem is difficult,

Why scoring functions cannot assess tail properties

Motivated by the growing interest in sound forecast evaluation techniques with an emphasis on distribution tails rather than average behaviour, we investigate a fundamental question arising in this

Forecast evaluation of quantiles, prediction intervals, and other set-valued functionals

We introduce a theoretical framework of elicitability and identifiability of set-valued functionals, such as quantiles, prediction intervals, and systemic risk measures. A functional is elicitable if

Of quantiles and expectiles: consistent scoring functions, Choquet representations and forecast rankings

In the practice of point prediction, it is desirable that forecasters receive a directive in the form of a statistical functional. For example, forecasters might be asked to report the mean or a

Higher order elicitability and Osband’s principle

A statistical functional, such as the mean or the median, is called elicitable if there is a scoring function or loss function such that the correct forecast of the functional is the unique minimizer

Forecast Evaluation of Set-Valued Functionals

A functional is elicitable (identifiable) if it is the unique minimiser (zero) of an expected scoring function (identification function). Elicitability and identifiability are essential for forecast

Strictly Proper Scoring Rules, Prediction, and Estimation

TLDR
The theory of proper scoring rules on general probability spaces is reviewed and developed, and the intuitively appealing interval score is proposed as a utility function in interval estimation that addresses width as well as coverage.

Elicitation complexity of statistical properties

TLDR
This work lays the foundation for a general theory of elicitation complexity, including several basic results about how elicitation complexity behaves, and the complexity of standard properties of interest.

Eliciting properties of probability distributions

We investigate the problem of truthfully eliciting an expert's assessment of a property of a probability distribution, where a property is any real-valued function of the distribution such as mean or