# Bayesian Nonlinear Support Vector Machines for Big Data

@inproceedings{Wenzel2017BayesianNS, title={Bayesian Nonlinear Support Vector Machines for Big Data}, author={F. Wenzel and Th{\'e}o Galy-Fajou and Matth{\"a}us Deutsch and M. Kloft}, booktitle={ECML/PKDD}, year={2017} }

We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional features over frequentist competitors such as accurate predictive uncertainty estimates and automatic hyperparameter search.
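For orientation, Bayesian SVMs of this kind are usually built on a data-augmentation identity from "Data augmentation for support vector machines" (listed in the references below), which writes the hinge pseudo-likelihood exp(-2 max(1 - u, 0)), with u = y f(x), as a location-scale mixture of Gaussians over a latent scale lam. The sketch below only checks that identity numerically; it is not the paper's inference method, and whether the paper uses exactly this parameterization is an assumption. The function names are hypothetical and chosen for illustration.

```python
import math


def hinge_pseudo_likelihood(u):
    """exp(-2 * max(1 - u, 0)), where u = y * f(x) is the signed margin."""
    return math.exp(-2.0 * max(1.0 - u, 0.0))


def gaussian_mixture_form(u, t_max=12.0, steps=20000):
    """Midpoint-rule estimate of the integral over lam in (0, inf) of
    Normal(u | mean = 1 + lam, variance = lam).

    The substitution lam = t**2 removes the 1/sqrt(lam) singularity at 0,
    leaving the integrand 2/sqrt(2*pi) * exp(-(1 + t**2 - u)**2 / (2*t**2)).
    t_max and steps are illustrative choices, not tuned values.
    """
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint of the k-th cell, never exactly 0
        z = 1.0 + t * t - u
        total += math.exp(-z * z / (2.0 * t * t))
    return 2.0 / math.sqrt(2.0 * math.pi) * total * h


if __name__ == "__main__":
    # Print both sides of the identity for a few margins.
    for u in (-2.0, 0.0, 0.5, 1.0, 3.0):
        print(u, hinge_pseudo_likelihood(u), gaussian_mixture_form(u))
```

Conditioned on the latent scale, the pseudo-likelihood is Gaussian in f, which is what makes closed-form conditional or variational updates plausible in models of this family.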

## 22 Citations

### Scalable Logit Gaussian Process Classification

- Computer Science
- 2017

We propose an efficient stochastic variational approach to Gaussian Process (GP) classification building on Pólya-Gamma data augmentation and inducing points, which is based on closed-form updates of…

### Fast Inference in Non-Conjugate Gaussian Process Models via Data Augmentation

- Computer Science
- 2018

This work presents AugmentedGaussianProcesses.jl, a software package for augmented stochastic variational inference (ASVI) for Gaussian process models with non-conjugate likelihood functions, and demonstrates that it is up to two orders of magnitude faster than the state-of-the-art.

### Scalable Multi-Class Bayesian Support Vector Machines for Structured and Unstructured Data

- Computer Science, ArXiv
- 2018

A new Bayesian multi-class support vector machine is introduced by formulating a pseudo-likelihood for a multi-class hinge loss in the form of a location-scale mixture of Gaussians, and its effectiveness is demonstrated in large-scale active learning and detection of adversarial images.

### Scalable Multi-Class Gaussian Process Classification via Data Augmentation

- Computer Science
- 2018

This paper proposes a new scalable multi-class Gaussian process classification approach building on a novel modified softmax likelihood function. This form of likelihood allows for a latent variable…

### A binary-response regression model based on support vector machines

- Computer Science
- 2020

This work considers a probabilistic regression model for binary-response data that is based on the optimization problem that characterizes the SVM, and proves that the maximum likelihood estimate (MLE) of the model exists, and that it is consistent and asymptotically normal.

### Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models

- Computer Science, AISTATS
- 2020

Automated augmented conjugate inference, a new inference method for non-conjugate Gaussian process (GP) models, is proposed, including a fast and scalable stochastic variational inference method that uses efficient block coordinate ascent updates computed in closed form.

### Efficient Gaussian Process Classification Using Polya-Gamma Data Augmentation

- Computer Science, AAAI
- 2019

We propose a scalable stochastic variational approach to GP classification building on Pólya-Gamma data augmentation and inducing points. Unlike former approaches, we obtain closed-form updates based…

### Generalizing and Scaling up Dynamic Topic Models via Inducing Point Variational Inference

- Computer Science
- 2017

The class of tractable priors is extended from Wiener processes to the generic class of Gaussian processes (GPs), and it is shown how to perform scalable approximate inference in these models based on stochastic variational inference and Gaussian processes with inducing points.

### Scalable Generalized Dynamic Topic Models

- Computer Science, AISTATS
- 2018

This paper extends the class of tractable priors from Wiener processes to the generic class of Gaussian processes (GPs), which allows exploring topics that develop smoothly over time, have a long-term memory, or are temporally concentrated (for event detection).

### Scalable Large Margin Gaussian Process Classification

- Computer Science, ECML/PKDD
- 2019

A new Large Margin Gaussian Process (LMGP) model is introduced by formulating a pseudo-likelihood for a generalised multi-class hinge loss, and a highly scalable training objective is derived using variational inference and an inducing-point approximation.

## References

### Mean field variational Bayesian inference for support vector machine classification

- Computer Science, Comput. Stat. Data Anal.
- 2014

### Bayesian Nonlinear Support Vector Machines and Discriminative Factor Modeling

- Computer Science, NIPS
- 2014

An extensive set of experiments demonstrates the utility of using a nonlinear Bayesian SVM within discriminative feature learning and factor modeling, from the standpoints of accuracy and interpretability.

### Gaussian Processes for Big Data

- Computer Science, UAI
- 2013

Stochastic variational inference for Gaussian process models is introduced, and it is shown how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables, which factorize the model in the manner necessary to perform variational inference.

### Data augmentation for support vector machines

- Computer Science
- 2011

A latent variable representation of regularized support vector machines is presented that enables EM, ECME, or MCMC algorithms to provide parameter estimates; the paper also shows how to implement SVMs with spike-and-slab priors and runs them against data from a standard spam filtering data set.

### Stochastic variational inference

- Computer Science, J. Mach. Learn. Res.
- 2013

Stochastic variational inference lets us apply complex Bayesian models to massive data sets, and it is shown that the Bayesian nonparametric topic model outperforms its parametric counterpart.

### Scalable Variational Gaussian Process Classification

- Computer Science, AISTATS
- 2015

This work shows how to scale Gaussian process classification within a variational inducing point framework, outperforming the state of the art on benchmark datasets; the approach can be exploited to allow classification in problems with millions of data points.

### An Adaptive Learning Rate for Stochastic Variational Inference

- Computer Science, ICML
- 2013

This work develops an adaptive learning rate for stochastic variational inference, which requires no tuning and is easily implemented with computations already made in the algorithm.

### Variational Learning of Inducing Variables in Sparse Gaussian Processes

- Computer Science, AISTATS
- 2009

A variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood.

### Fast Max-Margin Matrix Factorization with Data Augmentation

- Computer Science, ICML
- 2013

This paper presents a probabilistic M3F model that admits a highly efficient Gibbs sampling algorithm through data augmentation. It extends the approach to incorporate Bayesian nonparametrics and builds a truncation-free nonparametric M3F model in which the number of latent factors is unbounded and inferred from data.

### Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)

- Computer Science
- 2005

The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and includes detailed algorithms for the supervised-learning problem, covering both regression and classification.