• Corpus ID: 29166222

Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time

Asish Ghoshal and Jean Honorio
MAP perturbation models have emerged as a powerful framework for inference in structured prediction. Such models provide a way to efficiently sample from the Gibbs distribution and facilitate predictions that are robust to random noise. In this paper, we propose a provably polynomial-time randomized algorithm for learning the parameters of perturbed MAP predictors. Our approach is based on minimizing a novel Rademacher-based generalization bound on the expected loss of a perturbed MAP predictor…
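As a rough illustration of the perturb-and-MAP idea the abstract refers to (this is not the authors' learning algorithm), the sketch below uses the Gumbel-max trick: adding independent standard Gumbel noise to the score of every candidate label and taking the argmax yields an exact sample from the Gibbs distribution over those scores. The function names, the toy scores, and the sample count are all assumptions chosen for the example.

```python
import math
import random
from collections import Counter


def gibbs_probs(scores):
    """Exact Gibbs (softmax) probabilities, p(y) proportional to exp(scores[y])."""
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def gumbel(rng):
    """Draw one standard Gumbel variate by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:  # log(0) is undefined, so redraw in that rare case
        u = rng.random()
    return -math.log(-math.log(u))


def perturb_and_map(scores, rng):
    """Perturb every score with Gumbel noise and return the argmax index."""
    perturbed = [s + gumbel(rng) for s in scores]
    return max(range(len(scores)), key=perturbed.__getitem__)


def empirical_probs(scores, n_samples, seed=0):
    """Estimate label frequencies from repeated perturb-and-MAP draws."""
    rng = random.Random(seed)
    counts = Counter(perturb_and_map(scores, rng) for _ in range(n_samples))
    return [counts[i] / n_samples for i in range(len(scores))]


if __name__ == "__main__":
    toy_scores = [1.0, 0.0, -1.0]  # hypothetical scores for three labels
    print("exact    :", gibbs_probs(toy_scores))
    print("empirical:", empirical_probs(toy_scores, 20000))
```

This toy version enumerates every label explicitly, which is only feasible for very small output spaces; it is meant to show the sampling principle, not a practical structured predictor.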

Minimax bounds for structured prediction

This work provides minimax bounds for a class of factor-graph inference models for structured prediction, which characterize the necessary sample complexity for any conceivable algorithm to achieve learning of factor-graph predictors.

Towards Sharper Generalization Bounds for Structured Prediction

This paper investigates the generalization performance of structured prediction learning and obtains state-of-the-art generalization bounds from three different perspectives: Lipschitz continuity, smoothness, and space capacity condition.

Fast and Efficient DNN Deployment via Deep Gaussian Transfer Learning

  • Qi Sun, Chen Bai, Bei Yu
  • Computer Science
    2021 IEEE/CVF International Conference on Computer Vision (ICCV)
  • 2021
A novel transfer learning method based on deep Gaussian processes (DGPs) that achieves the best inference latencies of convolutions while accelerating the optimization process significantly, compared with previous arts.

Minimax Bounds for Structured Prediction Based on Factor Graphs

This work provides minimax lower bounds for a class of general factor-graph inference models in the context of structured prediction, and characterizes the necessary sample complexity for any conceivable algorithm to achieve learning of general factor-graph predictors.



Learning Efficient Random Maximum A-Posteriori Predictors with Non-Decomposable Loss Functions

Efficient methods for learning random MAP predictors for structured label problems are developed, and it is shown that any smooth posterior distribution would suffice to define a smooth PAC-Bayesian risk bound suitable for gradient methods.

On Measure Concentration of Random Maximum A-Posteriori Perturbations

New measure concentration inequalities are developed that bound the number of samples needed to estimate such expected values in the maximum a-posteriori perturbation framework.

Learning with Maximum A-Posteriori Perturbation Models

This paper analyzes and extends MAP perturbation models, and seeks to estimate dependent perturbations over the parameters using a hard-EM approach, cast in the form of inverse convex programs.

Structured Prediction: From Gaussian Perturbations to Linear-Time Principled Algorithms

This work studies this family of loss functions in the PAC-Bayes framework under Gaussian perturbations, and produces a tighter upper bound on the Gibbs decoder distortion than commonly used methods.

On Sampling from the Gibbs Distribution with Random Maximum A-Posteriori Perturbations

This paper provides means for drawing either approximate or unbiased samples from Gibbs' distributions by introducing low dimensional perturbations and solving the corresponding MAP assignments, which leads to new ways to derive lower bounds on partition functions.

On the Partition Function and Random Maximum A-Posteriori Perturbations

A novel framework for approximating and bounding the partition function using MAP inference on randomly perturbed models is provided and it is shown that the method excels in the typical "high signal - high coupling" regime that results in ragged energy landscapes difficult for alternative approaches.

More data means less inference: A pseudo-max approach to structured learning

This work shows that it is possible to circumvent the intractability of learning and inference for general graph structures when the distribution of training examples is rich enough, via a method similar in spirit to pseudo-likelihood, and achieves consistency.

Perturb-and-MAP random fields: Using discrete optimization to learn and sample from energy models

A novel way to induce a random field from an energy function on discrete labels by locally injecting noise to the energy potentials, followed by finding the global minimum of the perturbed energy function is proposed.

Complexity of Inference in Graphical Models

It is shown that low treewidth is indeed the only structural restriction of the underlying graph that can ensure tractability, and that even for the "best case" graph structure, there is no inference algorithm with complexity polynomial in the treewidth.

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.