• Corpus ID: 17798922

Probable convexity and its application to Correlated Topic Models

Khoat Than, Tu Bao Ho
Non-convex optimization problems often arise in probabilistic modeling, for example when estimating posterior distributions. Non-convexity makes these problems intractable and poses various obstacles to designing efficient algorithms. In this work, we attack non-convexity by first introducing the concept of probable convexity for analyzing the convexity of real functions in practice. We then use the new concept to analyze an inference problem in the Correlated Topic Model (CTM) and…

Complexity of Inference in Latent Dirichlet Allocation

This work studies the problem of finding the maximum a posteriori (MAP) assignment of topics to words, where the document's topic distribution is integrated out. It shows that exact inference takes polynomial time when the effective number of topics per document is small, whereas the problem is NP-hard when a document has a large number of topics.

On Tight Approximate Inference of the Logistic-Normal Topic Admixture Model

A new, tight approximate inference algorithm for LoNTAM is presented, based on a multivariate quadratic Taylor approximation scheme that facilitates elegant closed-form message passing and leads to more accurate recovery of the semantic truth underlying documents and more accurate parameter estimates compared to previous methods.

A correlated topic model of Science

The correlated topic model (CTM) is developed, in which the topic proportions exhibit correlation via the logistic normal distribution, and its use as an exploratory tool for large document collections is demonstrated.

Multi-field Correlated Topic Modeling

A new extension of the CTM method that enables modeling with multi-field topics in a global graphical structure, and a mean-field variational algorithm that allows joint learning of multinomial topic models from discrete data and Gaussian-style topic models for real-valued data, are proposed.

Fully Sparse Topic Models

This paper shows that FSTM can perform substantially better than various existing topic models by different performance measures, and provides a principled way to directly trade off sparsity of solutions against inference quality and running time.

Independent factor topic models

Independent Factor Topic Models (IFTM) are proposed, which use linear latent variable models to uncover the hidden sources of correlation between topics and provide a fast Newton-Raphson based variational inference algorithm.

Projection-free Online Learning

This work presents efficient online learning algorithms that eschew projections in favor of much more efficient linear optimization steps using the Frank-Wolfe technique, and obtains a range of regret bounds for online convex optimization, with better bounds for specific cases such as stochastic online smooth convex optimization.

Convex Optimization

A comprehensive introduction to the subject of convex optimization shows in detail how such problems can be solved numerically with great efficiency.

Covariance in Unsupervised Learning of Probabilistic Grammars

An alternative to the Dirichlet prior is suggested: a family of logistic normal distributions that permits soft parameter tying within grammars and across grammars for text in different languages. Empirical gains in a novel learning setting using bilingual, non-parallel data are shown.

Optimizing Semantic Coherence in Topic Models

A novel statistical topic model based on an automated evaluation metric is proposed that significantly improves topic quality in a large-scale document collection from the National Institutes of Health (NIH).