Corpus ID: 14110220

Kernel Topic Models

Philipp Hennig, David H. Stern, Ralf Herbrich, Thore Graepel
International Conference on Artificial Intelligence and Statistics

Latent Dirichlet Allocation models discrete data as a mixture of discrete distributions, using Dirichlet beliefs over the mixture weights. We study a variation of this concept, in which the documents’ mixture weight beliefs are replaced with squashed Gaussian distributions. This allows documents to be associated with elements of a Hilbert space, admitting kernel topic models (KTM), modelling temporal, spatial, hierarchical, social and other structure between documents. The main challenge is… 
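The idea of a squashed Gaussian over mixture weights can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the squashing function is a softmax, that each document has a single scalar feature (a timestamp), and that a squared-exponential kernel relates documents. The names `rbf_kernel`, `softmax`, and `sample_topic_weights` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def rbf_kernel(x, lengthscale=1.0):
    """Squared-exponential kernel matrix over 1-D document features (assumed: timestamps)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def softmax(y):
    """Squash each real-valued row onto the probability simplex (assumed squashing function)."""
    z = y - y.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sample_topic_weights(x, n_topics, lengthscale=1.0, seed=0):
    """Draw one latent Gaussian function per topic, then squash to per-document topic weights."""
    rng = np.random.default_rng(seed)
    # Small diagonal jitter keeps the Cholesky factorization numerically stable.
    K = rbf_kernel(x, lengthscale) + 1e-6 * np.eye(len(x))
    L = np.linalg.cholesky(K)
    # Each column of y is a sample from N(0, K): one smooth function of time per topic.
    y = L @ rng.standard_normal((len(x), n_topics))
    return softmax(y)

timestamps = np.linspace(0.0, 10.0, 50)  # hypothetical document timestamps
weights = sample_topic_weights(timestamps, n_topics=4, lengthscale=2.0)
```

Each row of `weights` is a valid topic mixture for one document, and documents with nearby timestamps receive similar mixtures because their latent Gaussian values are correlated through the kernel. Swapping in a different kernel is how spatial, hierarchical, or social structure between documents would enter under the same assumptions.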

Scalable Generalized Dynamic Topic Models

This paper extends the class of tractable priors from Wiener processes to the generic class of Gaussian processes (GPs), which makes it possible to explore topics that develop smoothly over time, have long-term memory, or are temporally concentrated (for event detection).

Neural Variational Correlated Topic Modeling

This paper proposes a novel Centralized Transformation Flow to capture the correlations among topics by reshaping topic distributions and presents the Transformation Flow Lower Bound to improve the performance of the proposed model.

Efficient integration of generative topic models into discriminative classifiers using robust probabilistic kernels

We propose an alternative to the generative classifier that usually models both the class conditionals and class priors separately, and then uses the Bayes theorem to compute the posterior

Efficient inference for dynamic topic modeling with large vocabularies

Dynamic topic modeling is a well established tool for capturing the temporal dynamics of the topics of a corpus. In this work, we develop a scalable dynamic topic model by utilizing the correlation

Neural Variational Inference For Topic Models

This work presents what is, to the authors' knowledge, the first effective neural variational inference (NVI) method for latent Dirichlet allocation (LDA), tackling the problems caused for NVI by the Dirichlet prior and by component collapsing, and finds that NVI matches traditional methods in accuracy with much better inference time.

Generalizing and Scaling up Dynamic Topic Models via Inducing Point Variational Inference

The class of tractable priors is extended from Wiener processes to the generic class of Gaussian processes (GPs), and it is shown how to perform scalable approximate inference in these models based on ideas from stochastic variational inference and Gaussian processes with inducing points.

Learning Multilingual Topics with Neural Variational Inference

A new multilingual topic model is proposed that can be trained by backpropagation in the framework of neural variational inference. It infers topic distributions via a shared inference network that captures common word semantics, and uses an incorporating module to bring in the topic-word distribution from another language through a novel transformation method.

Variational Gaussian Topic Model with Invertible Neural Projections

To address the limitation that pre-trained word embeddings of topic-associated words do not follow a multivariate Gaussian, Variational Gaussian Topic Model with Invertible Neural Projections (VaGTM-IP) is extended from VaGTM.

A Model of Text for Experimentation in the Social Sciences

A hierarchical mixed membership model for analyzing the topical content of documents, in which mixing weights are parameterized by observed covariates, is posited, enabling researchers to introduce elements of the experimental design that informed document collection into the model, within a generally applicable framework.

Laplace Matching for fast Approximate Inference in Latent Gaussian Models

Laplace Matching, an approximate inference framework primarily designed to be computationally cheap while still achieving high approximation quality, is proposed, showing approximation quality comparable to state-of-the-art approximate inference techniques at a drastic reduction in computational cost.



Latent Dirichlet Allocation

A correlated topic model of Science

The correlated topic model (CTM) is developed, where the topic proportions exhibit correlation via the logistic normal distribution, and its use as an exploratory tool for large document collections is demonstrated.

Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression

A Dirichlet-multinomial regression topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates is proposed.

Dynamic topic models

A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections, and dynamic topic models provide a qualitative window into the contents of a large document collection.

Continuous Time Dynamic Topic Models

An efficient variational approximate inference algorithm is derived that takes advantage of the sparsity of observations in text, a property that lets us easily handle many time points.

The Author-Topic Model for Authors and Documents

The author-topic model is introduced, a generative model for documents that extends Latent Dirichlet Allocation to include authorship information, and applications to computing similarity between authors and entropy of author output are demonstrated.

Online Learning for Latent Dirichlet Allocation

An online variational Bayes (VB) algorithm for Latent Dirichlet Allocation (LDA) based on online stochastic optimization with a natural gradient step is developed, which is shown to converge to a local optimum of the VB objective function.

Topic models

The specific topic model the authors consider is called latent Dirichlet allocation (LDA), based on the intuition that each document contains words from multiple topics; the proportion of each topic in each document is different, but the topics themselves are the same for all documents.

Topics over time: a non-Markov continuous-time model of topical trends

An LDA-style topic model is presented that captures not only the low-dimensional structure of data, but also how the structure changes over time, showing improved topics, better timestamp prediction, and interpretable trends.

Markov Random Topic Fields

A topic model is presented that makes use of one or more user-specified graphs describing relationships between documents; these graphs are encoded as a Markov random field over topics and encourage related documents to have similar topic structures.