• Corpus ID: 2418334

Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves

Fei Tian, Bin Gao, Di He, Tie-Yan Liu
We propose Sentence Level Recurrent Topic Model (SLRTM), a new topic model that assumes the generation of each word within a sentence depends on both the topic of the sentence and the whole history of its preceding words in the sentence. [...] Experimental results show that SLRTM outperforms several strong baselines on various tasks.
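
As a rough sketch of the generative assumption stated in the abstract, the probability of a sentence under a given topic can be written as a product of per-word conditionals. The symbols here (k for the sentence topic, w_t for the t-th word, T for the sentence length) are notation assumed for illustration and may differ from the paper's own.

```latex
p(w_1, \dots, w_T \mid k) \;=\; \prod_{t=1}^{T} p\bigl(w_t \mid w_1, \dots, w_{t-1},\, k\bigr)
```

Each factor conditions on both the sentence topic and the full word history, in contrast to a bag-of-words topic model, where each word would depend on the topic alone.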

Sentence level topic models for associated topics extraction

An associated topic model (ATM) is developed, in which consecutive sentences are considered important and the topic assignments for words are jointly determined by the association matrix and the sentence level topic distributions, instead of the document-specific topic distributions only.

Language Model-Driven Topic Clustering and Summarization for News Articles

A Language Model-based Topic Model (LMTM) for topic clustering is proposed, which uses an LM to generate deep contextualized word representations; the readable and reasonable summaries it generates validate the rationale behind the model components.

Topic-Transformer for Document-Level Language Understanding

This study focuses on simultaneously capturing syntax and global semantics from a text, thus acquiring document-level understanding with a Topic-Transformer that combines the benefits of a neural topic model that captures global semantic information and a transformer-based language model, which can capture the local structure of texts both semantically and syntactically.

A Text Generation Model that Maintains the Order of Words, Topics, and Parts of Speech via Their Embedding Representations and Neural Language Models

This work focuses on parts of speech (POS), e.g. noun, verb, and preposition, to enhance these models and to allow for truly coherent text more efficiently than is possible by using any of them in isolation.

Latent LSTM Allocation: Joint Clustering and Non-Linear Dynamic Modeling of Sequence Data

In this paper, Latent LSTM Allocation (LLA) is introduced for user modeling combining hierarchical Bayesian models with LSTMs, and an efficient Stochastic EM inference algorithm is presented for this model that scales to millions of users/documents.

A hybrid neural network hidden Markov model approach for automatic story segmentation

Experimental results on the TDT2 corpus show that the proposed NN-HMM approach outperforms the traditional HMM approach significantly and achieves state-of-the-art performance in story segmentation.

Uncovering Hidden Structure in Sequence Data via Threading Recurrent Models

An efficient sampler for inference, based on the particle MCMC method, that can draw from the joint posterior directly is presented, and experimental results confirm the superiority of thLLA and the stability of the new inference algorithm on a variety of domains.

Topic Modeling using Variational Auto-Encoders with Gumbel-Softmax and Logistic-Normal Mixture Distributions

Two new text topic models based on the categorical distribution Gumbel-Softmax (GSDTM) and on mixtures of Logistic-Normal distributions (LMDTM) are proposed, and it is shown that GSDTM largely outperforms previous state-of-the-art baselines when considering three different evaluation metrics.



Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models

The recently proposed hierarchical recurrent encoder-decoder neural network is extended to the dialogue domain, and it is demonstrated that this model is competitive with state-of-the-art neural language models and back-off n-gram models.

Topic modeling: beyond bag-of-words

A hierarchical generative probabilistic model that incorporates both n-gram statistics and latent topic variables by extending a unigram topic model to include properties of a hierarchical Dirichlet bigram language model is explored.

A Novel Neural Topic Model and Its Supervised Extension

A novel neural topic model (NTM) is proposed in which the representations of words and documents are efficiently and naturally combined into a uniform framework; the model is competitive in both topic discovery and classification/regression tasks.

Hidden Topic Markov Models

This paper proposes modeling the topics of words in the document as a Markov chain, and shows that incorporating this dependency allows us to learn better topics and to disambiguate words that can belong to different topics.

Ordering-Sensitive and Semantic-Aware Topic Modeling

This paper presents a Gaussian Mixture Neural Topic Model (GMNTM), which incorporates both the ordering of words and the semantic meaning of sentences into topic modeling and can learn better topics and more accurate word distributions for each topic.

Neural Responding Machine for Short-Text Conversation

An empirical study shows that NRM can generate grammatically correct and content-wise appropriate responses to over 75% of the input text, outperforming state-of-the-art methods in the same setting, including retrieval-based and SMT-based models.

Topic sentiment mixture: modeling facets and opinions in weblogs

The proposed Topic-Sentiment Mixture (TSM) model can reveal the latent topical facets in a Weblog collection, the subtopics in the results of an ad hoc query, and their associated sentiments; it could also provide general sentiment models that are applicable to any ad hoc topics.

A Neural Autoregressive Topic Model

A new model for learning meaningful representations of text documents from an unlabeled collection of documents that takes inspiration from the conditional mean-field recursive equations of the Replicated Softmax to define a neural network architecture that estimates the probability of observing a new word in a given document given the previously observed words.

Structural Topic Model for Latent Topical Structure Analysis

A new topic model is proposed, Structural Topic Model, which simultaneously discovers topics and reveals the latent topical structures in text through explicitly modeling topical transitions with a latent first-order Markov chain.

Gaussian LDA for Topic Models with Word Embeddings

LDA's parameterization of topics as categorical distributions over opaque word types is replaced with multivariate Gaussian distributions on the embedding space, which encourages the model to group words that are a priori known to be semantically related into topics.