Corpus ID: 221517003

Admissible anytime-valid sequential inference must rely on nonnegative martingales.

Aaditya Ramdas, Johannes Ruf, Martin Larsson, Wouter M. Koolen
arXiv: Statistics Theory
Wald's anytime-valid $p$-values and Robbins' confidence sequences enable sequential inference for composite and nonparametric classes of distributions at arbitrary stopping times, as do more recent proposals involving Vovk's `$e$-values' or Shafer's `betting scores'. Examining the literature, one finds that at the heart of all these (quite different) approaches has been the identification of composite nonnegative (super)martingales. Thus, informally, nonnegative (super)martingales are known to… 
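As a rough illustration of why nonnegative martingales yield anytime-valid $p$-values, the sketch below builds a likelihood-ratio martingale for a simple coin-flip null and converts its running maximum into a $p$-value using Ville's inequality, which bounds the probability that a nonnegative supermartingale started at one ever reaches $1/\alpha$ by $\alpha$. The coin-flip setting, the function names, and the parameters `p_null` and `q_alt` are assumptions made for this example; they are not taken from the paper, which treats composite and nonparametric classes.

```python
import random


def likelihood_ratio_martingale(flips, p_null=0.5, q_alt=0.7):
    """Running wealth M_t = prod_i q(x_i) / p(x_i).

    Under the simple null Bernoulli(p_null) this is a nonnegative
    martingale with initial value one (hypothetical toy setting).
    """
    wealth = 1.0
    path = []
    for x in flips:
        if x == 1:
            wealth *= q_alt / p_null
        else:
            wealth *= (1 - q_alt) / (1 - p_null)
        path.append(wealth)
    return path


def anytime_p_values(path):
    """p_t = min(1, 1 / max_{s <= t} M_s).

    By Ville's inequality, P(exists t: p_t <= alpha) <= alpha under the
    null, so the analyst may stop at any data-dependent time.
    """
    running_max = 0.0
    p_values = []
    for m in path:
        running_max = max(running_max, m)
        p_values.append(min(1.0, 1.0 / running_max))
    return p_values


def rejection_rate(p_true, alpha=0.05, n_flips=200, n_runs=2000, seed=0):
    """Fraction of simulated runs in which the p-value ever drops to alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_runs):
        flips = [1 if rng.random() < p_true else 0 for _ in range(n_flips)]
        p_values = anytime_p_values(likelihood_ratio_martingale(flips))
        if min(p_values) <= alpha:
            rejections += 1
    return rejections / n_runs


if __name__ == "__main__":
    print("rejection rate under the null:", rejection_rate(0.5))
    print("rejection rate under the alternative:", rejection_rate(0.7))
```

Monitoring the running maximum amounts to peeking after every flip, yet the rejection rate under the null should stay near or below `alpha` up to simulation noise, while the rate under the alternative should be high.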
How can one test if a binary sequence is exchangeable? Fork-convex hulls, supermartingales, and Snell envelopes
This work utilizes a geometric concept called “fork-convexity” (an adapted analog of convexity) that lies at the heart of this problem, and derives a nonnegative process that is upper bounded by a martingale, but is not itself a supermartingale.
Variance-adaptive confidence sequences by betting
This paper derives confidence intervals (CI) and time-uniform confidence sequences (CS) for an unknown mean based on bounded observations. Our methods are based on a new general approach for deriving
Sequential Estimation of Convex Divergences using Reverse Submartingales and Exchangeable Filtrations
We present a unified technique for sequential estimation of convex divergences between distributions, including integral probability metrics like the kernel maximum mean discrepancy, φ-divergences
Valid sequential inference on probability forecast performance
Probability forecasts for binary events play a central role in many applications. Their quality is commonly assessed with proper scoring rules, which assign forecasts a numerical score such that a
The Safe Log Rank Test: Error Control under Optional Stopping, Continuation and Prior Misspecification
The safe logRank test is introduced, a version of the log rank test that can retain type-I error guarantees under optional stopping and continuation and can be extended to define always-valid confidence intervals.
Trade-off between validity and efficiency of merging p-values under arbitrary dependence
Various methods of combining individual p-values into one p-value are widely used in many areas of statistical applications. We say that a combining method is valid for arbitrary dependence (VAD) if
Off-policy Confidence Sequences
This work develops confidence bounds that hold uniformly over time for off-policy evaluation in the contextual bandit setting and provides algorithms for computing these confidence sequences that strike a good balance between computational and statistical efficiency.
p-value peeking and estimating extrema.
A pervasive issue in statistical hypothesis testing is that the reported $p$-values are biased downward by data "peeking" -- the practice of reporting only progressively extreme values of the test
Accumulation Bias: How to handle it ALL-IN
An estimated 85% of global health research investment is wasted (Chalmers and Glasziou, 2009); a total of one hundred billion US dollars in the year 2009 when it was estimated. The movement to reduce
Sequentially valid tests for forecast calibration
Forecasting and forecast evaluation are inherently sequential tasks. Predictions are often issued on a regular basis, such as every hour, day, or month, and their quality is monitored continuously.


Safe Testing
Sharing Fisherian, Neymanian and Jeffreys-Bayesian interpretations, S-values and safe tests may provide a methodology acceptable to adherents of all three schools.
Time-uniform, nonparametric, nonasymptotic confidence sequences
A confidence sequence is a sequence of confidence intervals that is uniformly valid over an unbounded time horizon. Our work develops confidence sequences whose widths go to zero, with nonasymptotic
Always Valid Inference: Bringing Sequential Analysis to A/B Testing
This work defines always valid p-values and confidence intervals that let users try to take advantage of data as fast as it becomes available, providing valid statistical inference whenever they make their decision.
Likelihood, Replicability and Robbins' Confidence Sequences
The widely claimed replicability crisis in science may lead to revised standards of significance. The customary frequentist confidence intervals, calibrated through hypothetical repetitions of the
Universal inference
A surprisingly simple method is given for producing statistical significance statements without any regularity conditions. It is shown that, when computing the MLE is hard, it is sufficient to upper bound the maximum likelihood for the purpose of constructing valid tests and intervals.
Time-uniform Chernoff bounds via nonnegative supermartingales
We develop a class of exponential bounds for the probability that a martingale sequence crosses a time-dependent linear threshold. Our key insight is that it is both natural and fruitful to formulate
Minimax rates without the fixed sample size assumption.
We generalize the notion of minimax convergence rate. In contrast to the standard definition, we do not assume that the sample size is fixed in advance. Allowing for varying sample size results in
Test Martingales, Bayes Factors and p-Values
A nonnegative martingale with initial value equal to one measures evidence against a probabilistic hypothesis. The inverse of its value at some stopping time can be interpreted as a Bayes factor. If
Conditional infimum and recovery of monotone processes
Monotone processes, just like martingales, can often be recovered from their final values. Examples include running maxima of supermartingales, as well as running maxima, local times, and various
A General Class of Exponential Inequalities for Martingales and Ratios
In this paper we introduce a technique for obtaining exponential inequalities, with particular emphasis placed on results involving ratios. Our main applications consist of approximations to the tail