Bayesian Nonlinear Support Vector Machines for Big Data

F. Wenzel, Théo Galy-Fajou, Matthäus Deutsch, M. Kloft
We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional features over frequentist competitors such as accurate predictive uncertainty estimates and automatic hyperparameter search. 
Scalable Logit Gaussian Process Classification
We propose an efficient stochastic variational approach to Gaussian Process (GP) classification building on Pólya-Gamma data augmentation and inducing points, which is based on closed-form updates of...
Fast Inference in Non-Conjugate Gaussian Process Models via Data Augmentation
We present AugmentedGaussianProcesses.jl, a software package for augmented stochastic variational inference (ASVI) for Gaussian process models with non-conjugate likelihood functions. The idea of...
Scalable Multi-Class Bayesian Support Vector Machines for Structured and Unstructured Data
A new Bayesian multi-class support vector machine is introduced by formulating a pseudo-likelihood for a multi-class hinge loss in the form of a location-scale mixture of Gaussians; its effectiveness is demonstrated in large-scale active learning and the detection of adversarial images.
Scalable Multi-Class Gaussian Process Classification via Data Augmentation
This paper proposes a new scalable multi-class Gaussian process classification approach building on a novel modified softmax likelihood function. This form of likelihood allows for a latent variable...
A binary-response regression model based on support vector machines
This work considers a probabilistic regression model for binary-response data that is based on the optimization problem that characterizes the SVM, and proves that the maximum likelihood estimate (MLE) of the model exists, and that it is consistent and asymptotically normal.
Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models
Automated augmented conjugate inference, a new inference method for non-conjugate Gaussian process (GP) models, is proposed. Among the inference methods developed is a fast and scalable stochastic variational inference method that uses efficient block coordinate ascent updates computed in closed form.
Efficient Gaussian Process Classification Using Polya-Gamma Data Augmentation
We propose a scalable stochastic variational approach to GP classification building on Pólya-Gamma data augmentation and inducing points. Unlike former approaches, we obtain closed-form updates based...
Generalizing and Scaling up Dynamic Topic Models via Inducing Point Variational Inference
Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that topics change continuously over time and therefore...
Scalable Generalized Dynamic Topic Models
This paper extends the class of tractable priors from Wiener processes to the generic class of Gaussian processes (GPs), which makes it possible to explore topics that develop smoothly over time, that have a long-term memory, or that are temporally concentrated (for event detection).
Scalable Large Margin Gaussian Process Classification
A new Large Margin Gaussian Process (LMGP) model is introduced by formulating a pseudo-likelihood for a generalised multi-class hinge loss, and a highly scalable training objective is derived using variational inference and an inducing-point approximation.


Mean field variational Bayesian inference for support vector machine classification
This representation circumvents many of the shortcomings associated with classical SVMs, enabling automatic penalty parameter selection and the handling of dependent samples, missing data, and variable selection; it outperforms the classical SVM approach whilst remaining computationally efficient.
Bayesian Nonlinear Support Vector Machines and Discriminative Factor Modeling
An extensive set of experiments demonstrates the utility of using a nonlinear Bayesian SVM within discriminative feature learning and factor modeling, from the standpoints of accuracy and interpretability.
Gaussian Processes for Big Data
Stochastic variational inference for Gaussian process models is introduced, and it is shown how GPs can be variationally decomposed to depend on a set of globally relevant inducing variables which factorize the model in the manner necessary to perform variational inference.
Data augmentation for support vector machines
This paper presents a latent variable representation of regularized support vector machines (SVMs) that enables EM, ECME or MCMC algorithms to provide parameter estimates. We verify our...
Stochastic variational inference
Stochastic variational inference lets us apply complex Bayesian models to massive data sets, and it is shown that the Bayesian nonparametric topic model outperforms its parametric counterpart.
Scalable Variational Gaussian Process Classification
This work shows how to scale the model within a variational inducing point framework, outperforming the state of the art on benchmark datasets, and can be exploited to allow classification in problems with millions of data points.
An Adaptive Learning Rate for Stochastic Variational Inference
This work develops an adaptive learning rate for stochastic variational inference, which requires no tuning and is easily implemented with computations already made in the algorithm.
Variational Learning of Inducing Variables in Sparse Gaussian Processes
M. Titsias, 2009
A variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood.
Fast Max-Margin Matrix Factorization with Data Augmentation
This paper presents a probabilistic M3F model that admits a highly efficient Gibbs sampling algorithm through data augmentation. It extends the approach to incorporate Bayesian nonparametrics and accordingly builds a truncation-free nonparametric M3F model in which the number of latent factors is literally unbounded and inferred from data.
Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)
The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and includes detailed algorithms for the supervised-learning problem for both regression and classification.