Erratum: “Improving Topic Models with Latent Feature Word Representations”

Dat Quoc Nguyen, Richard Billingsley, Lan Du and Mark Johnson. Transactions of the Association for Computational Linguistics.
Change in clustering and classification results due to the DMM and LF-DMM bugs.
10 Citations

Graph Attention Topic Modeling Network

A new method to overcome the overfitting issue of pLSI is provided by using amortized inference with word embeddings as input, instead of the Dirichlet prior in LDA.

Discriminative Topic Mining via Category-Name Guided Text Embedding

CatE, a novel category-name guided text embedding method for discriminative topic mining, is developed; it effectively leverages minimal user guidance to learn a discriminative embedding space and discover category-representative terms in an iterative manner.

Jointly Learning Word Embeddings and Latent Topics

The experimental results demonstrate that the STE model can indeed generate useful topic-specific word embeddings and coherent latent topics in an effective and efficient way.

Collaboratively Improving Topic Discovery and Word Embeddings by Coordinating Global and Local Contexts

This paper empirically shows that by incorporating both global and local context, this collaborative model can not only significantly improve the performance of topic discovery over the baseline topic models, but also learn better word embeddings than the baseline word embedding models.

Variational low rank multinomials for collaborative filtering with side-information

A simple and flexible framework for building collaborative filtering models that incorporate side-information or metadata about items in addition to user-item interaction data, together with an efficient technique for approximating posteriors over model parameters using variational inference, is developed.

KATE: K-Competitive Autoencoder for Text

This paper proposes a novel k-competitive autoencoder, called KATE, for text documents that outperforms deep generative models, probabilistic topic models, and even word representation models in terms of several downstream tasks such as document classification, regression, and retrieval.

Exploring Time-Sensitive Variational Bayesian Inference LDA for Social Media Data

This paper examines the performance of the VB-based topic modelling approach for producing coherent topics, and proposes a novel time-sensitive Variational Bayesian implementation, denoted as TVB, which can more accurately estimate topical trends, making it particularly suitable to assist end-users in tracking emerging topics on social media.

Seekers, Providers, Welcomers, and Storytellers: Modeling Social Roles in Online Health Communities

It is found that members frequently change roles over their history, from ones that seek resources to ones offering help, while the distribution of roles is stable over the community's history.

Tiradentes on TripAdvisor - What do people say about this charming historic town?

Tourism is an area that has been greatly affected by the expansion of the internet. Today it is possible to plan a trip from home, using only information from the web. However, users have reached a point where

Word Features for Latent Dirichlet Allocation

We extend Latent Dirichlet Allocation (LDA) by explicitly allowing for the encoding of side information in the distribution over words. This results in a variety of new capabilities, such as improved

A Novel Neural Topic Model and Its Supervised Extension

A novel neural topic model (NTM) is proposed, in which the representations of words and documents are efficiently and naturally combined into a uniform framework; it is competitive in both topic discovery and classification/regression tasks.

Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality

This work explores the two tasks of automatic evaluation of single topics and automatic evaluation of whole topic models, and provides recommendations on the best strategy for performing the two tasks, in addition to providing an open-source toolkit for topic and topic model evaluation.

Sprite: Generalizing Topic Models with Structured Priors

A Sprite-based model is constructed to jointly infer topic hierarchies and author perspective, which is applied to corpora of political debates and online reviews and shows that the model learns intuitive topics, outperforming several other topic models at predictive tasks.

Latent Dirichlet Allocation

Improving LDA topic models for microblogs via tweet pooling and automatic labeling

This paper empirically establishes that a novel method of tweet pooling by hashtags leads to a vast improvement in a variety of measures for topic coherence across three diverse Twitter datasets in comparison to an unmodified LDA baseline and a range of pooling schemes.

Replicated Softmax: an Undirected Topic Model

We introduce a two-layer undirected graphical model, called a "Replicated Softmax", that can be used to model and automatically extract low-dimensional latent semantic representations from a large

Reading Tea Leaves: How Humans Interpret Topic Models

New quantitative methods for measuring semantic meaning in inferred topics are presented, showing that they capture aspects of the model that are undetected by previous measures of model quality based on held-out likelihood.

A Hidden Topic-Based Framework toward Building Applications with Short Web Documents

A hidden topic-based framework for processing short and sparse documents on the Web, showing that common hidden topics discovered from large external data sets (universal data sets), when included, can make short documents less sparse and more topic-oriented.

Sparse Additive Generative Models of Text

This approach has two key advantages: it can enforce sparsity to prevent overfitting, and it can combine generative facets through simple addition in log space, avoiding the need for latent switching variables.