• Corpus ID: 12867258

# A scaled Bregman theorem with applications

@article{Nock2016ASB,
title={A scaled Bregman theorem with applications},
author={Richard Nock and Aditya Krishna Menon and Cheng Soon Ong},
journal={ArXiv},
year={2016},
volume={abs/1607.00360}
}
• Published 1 July 2016
• Computer Science, Mathematics
• ArXiv
Bregman divergences play a central role in the design and analysis of a range of machine learning algorithms. This paper explores the use of Bregman divergences to establish reductions between such algorithms and their analyses. We present a new scaled isodistortion theorem involving Bregman divergences (scaled Bregman theorem for short) which shows that certain "Bregman distortions" (employing a potentially non-convex generator) may be exactly re-written as a scaled Bregman divergence…
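
For readers unfamiliar with the object the abstract builds on, the following is a minimal sketch of the standard Bregman divergence, $D_\varphi(x \| y) = \varphi(x) - \varphi(y) - \langle \nabla\varphi(y), x - y \rangle$, for a differentiable generator $\varphi$. It does not implement the paper's scaled theorem, whose statement is truncated above. The function and variable names are hypothetical and chosen only for illustration.

```python
import math

def bregman_divergence(phi, grad_phi, x, y):
    """Standard Bregman divergence D_phi(x || y) for vectors given as lists.

    phi and grad_phi are the generator and its gradient (illustrative names).
    """
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

# Generator 1: squared Euclidean norm, which yields the squared Euclidean distance.
def sq_norm(x):
    return sum(v * v for v in x)

def sq_norm_grad(x):
    return [2.0 * v for v in x]

# Generator 2: negative Shannon entropy, which yields the (generalized) KL divergence.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)

def neg_entropy_grad(x):
    return [math.log(v) + 1.0 for v in x]

# Two example points on the probability simplex (arbitrary illustrative values).
x = [0.2, 0.8]
y = [0.5, 0.5]

d_euclid = bregman_divergence(sq_norm, sq_norm_grad, x, y)
d_kl = bregman_divergence(neg_entropy, neg_entropy_grad, x, y)

print("squared Euclidean:", d_euclid)
print("KL divergence:", d_kl)
```

Both generators here are convex; the abstract's "Bregman distortions" relax this by allowing a potentially non-convex generator.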


Some New Flexibilizations of Bregman Divergences and Their Asymptotics
• Computer Science
GSI
• 2017
New divergences between (non-)probability distributions are introduced that cover, in particular, the corresponding OBD, SBD, COD and CSBD (for separable situations) as special cases; non-convex generators are also employed.
3D Insights to Some Divergences for Robust Statistics and Machine Learning
• Computer Science
GSI
• 2017
A special SBD subclass is constructed which covers both the often used power divergences (of CASD type) as well as their robustness-enhanced extensions with non-convex non-concave $\phi$.
Some Universal Insights on Divergences for Statistics, Machine Learning and Artificial Intelligence
• Computer Science
Geometric Structures of Information
• 2018
A correspondingly unifying framework is presented which – by its nature as a “structure on structures” – also qualifies as a basis for similarity-based multistage AI and more humanlike (robustly generalizing) machine learning.
Representation Learning of Compositional Data
• Computer Science
NeurIPS
• 2018
This work focuses on principal component analysis (PCA) and proposes an approach that allows low-dimensional representation learning directly from the original data. It includes a convenient surrogate (upper bound) loss for exponential family PCA that has an easy-to-optimize form.
A Unified Framework for Multi-distribution Density Ratio Estimation
• Computer Science, Mathematics
ArXiv
• 2021
A general framework from the perspective of Bregman divergence minimization is developed, justifying the use of any strictly proper scoring rule composite with a link function for multi-distribution DRE and leading to methods that strictly generalize their counterparts in binary DRE, as well as new methods that show comparable or superior performance on various downstream tasks.
Supervised Learning: No Loss No Cry
• Computer Science
ICML
• 2020
This paper revisits the SLIsotron algorithm of Kakade et al. (2011) through a novel lens, derives a generalisation based on Bregman divergences, and shows how it provides a principled procedure for learning the loss.
• Computer Science
ArXiv
• 2022
A new path forward for the generation of tabular data is proposed, exploiting decades-old understanding of the supervised task’s best components for DT induction, from losses (properness), models (tree-based) to algorithms (boosting).
A new toolkit for robust distributional change detection
• Computer Science
Applied Stochastic Models in Business and Industry
• 2018

## References

On Conformal Divergences and Their Population Minimizers
• Computer Science
IEEE Transactions on Information Theory
• 2016
It is proved that conformal divergences are essentially exhaustive for their left and right population minimizers, and the role of the (u, v)-geometric structure in clustering is discussed.
Clustering with Bregman Divergences
• Computer Science
J. Mach. Learn. Res.
• 2005
This paper proposes and analyzes parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences, and shows that there is a bijection between regular exponential families and a large class of Bregman divergences, called regular Bregman divergences.
Mixed Bregman Clustering with Approximation Guarantees
• Computer Science
ECML/PKDD
• 2008
This paper first unites the two frameworks by generalizing the former improvement to Bregman seeding by integrating the distortion at hand in both the initialization and iterative steps, and generalizing its important theoretical approximation guarantees as well.
Matrix Nearness Problems with Bregman Divergences
• Computer Science, Mathematics
SIAM J. Matrix Anal. Appl.
• 2007
This paper discusses a new class of matrix nearness problems that measure approximation error using a directed distance measure called a Bregman divergence, and proposes a framework for studying these problems, discusses some specific matrix nearness problems, and provides algorithms for solving them numerically.
Bregman Divergences and Surrogates for Learning
• Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence
• 2009
This paper addresses the problem for a wide set which lies at the intersection of classification calibrated surrogates and those of Murata et al. (2004), and gives a minimization algorithm provably converging to the minimum of any such surrogate.
Bregman Divergences and Triangle Inequality
• Mathematics
SDM
• 2013
This paper investigates the relationship between two families of symmetrized Bregman divergences and metrics that satisfy the triangle inequality, and interprets the required structure in terms of cumulants of infinitely divisible distributions and related results in harmonic analysis.
Logistic Regression, AdaBoost and Bregman Distances
• Computer Science
Machine Learning
• 2004
A unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances, and a parameterized family of algorithms that includes both a sequential- and a parallel-update algorithm as special cases are described, thus showing how the sequential and parallel approaches can themselves be unified.
Low-Rank Kernel Learning with Bregman Matrix Divergences
• Computer Science
J. Mach. Learn. Res.
• 2009
This paper proposes efficient algorithms that scale linearly in the number of data points and quadratically in the rank of the input matrix and employs Bregman matrix divergences as the measures of nearness for low-rank matrix nearness problems.
Efficient Bregman Range Search
An algorithm for efficient range search when the notion of dissimilarity is given by a Bregman divergence is developed based on a recently proposed space decomposition for Bregman divergences.