Topic Modelling Meets Deep Neural Networks: A Survey

@inproceedings{Zhao2021TopicMM,
  title={Topic Modelling Meets Deep Neural Networks: A Survey},
  author={He Zhao and Dinh Q. Phung and Viet Huynh and Yuan Jin and Lan Du and Wray L. Buntine},
  booktitle={IJCAI},
  year={2021}
}
Topic modelling has been a successful technique for text analysis for almost twenty years. When topic modelling met deep neural networks, there emerged a new and increasingly popular research area, neural topic models, with over a hundred models developed and a wide range of applications in neural language understanding such as text generation, summarisation and language models. There is a need to summarise research developments and discuss open problems and future directions. In this paper, we…
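Since the survey is about neural topic models, a concrete anchor may help: the dominant family pairs a variational autoencoder (VAE) encoder, which maps a document's bag-of-words vector to latent topic proportions, with a decoder that reconstructs the word counts. Below is a minimal, ProdLDA-flavoured sketch of that family; all sizes, names and the logistic-normal approximation are illustrative assumptions, not code from the survey.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTopicModel(nn.Module):
    # Minimal VAE-style neural topic model (illustrative sketch).
    def __init__(self, vocab_size=2000, num_topics=50, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.Softplus())
        self.mu = nn.Linear(hidden, num_topics)
        self.logvar = nn.Linear(hidden, num_topics)
        self.decoder = nn.Linear(num_topics, vocab_size, bias=False)  # topic-word logits

    def forward(self, bow):
        h = self.encoder(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterisation trick
        theta = F.softmax(z, dim=-1)                          # topic proportions
        log_recon = F.log_softmax(self.decoder(theta), dim=-1)
        recon = -(bow * log_recon).sum(-1)                    # reconstruction term
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1)
        return (recon + kl).mean()                            # negative ELBO

model = NeuralTopicModel()
loss = model(torch.rand(8, 2000).round())  # fake mini-batch of BoW vectors
loss.backward()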

Citations of this paper

Neural Attention-Aware Hierarchical Topic Model
TLDR
This work proposes a variational autoencoder (VAE) based neural topic model (NTM) that jointly reconstructs the sentence and document word counts using combinations of bag-of-words (BoW) topical embeddings and pre-trained semantic embeddings.
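One plausible reading of the reconstruction described above is a decoder whose topic-word scores combine learned topical embeddings with fixed pre-trained word embeddings. The sketch below illustrates that combination under assumed shapes; it is not the paper's implementation.

import numpy as np

rng = np.random.default_rng(0)
K, V, D = 20, 500, 100                 # topics, vocab size, embedding dim
topic_emb = rng.normal(size=(K, D))    # learned topical embeddings
word_emb = rng.normal(size=(V, D))     # pre-trained word embeddings, kept fixed
theta = rng.dirichlet(np.ones(K))      # a document's topic proportions

logits = theta @ (topic_emb @ word_emb.T)  # (V,) scores over the vocabulary
p_w = np.exp(logits - logits.max())
p_w /= p_w.sum()                           # reconstruction distribution over words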
Is Automated Topic Model Evaluation Broken?: The Incoherence of Coherence
TLDR
This work assesses a dominant classical model and two state-of-the-art neural models in a systematic, clearly documented, reproducible way, and addresses both the standardization gap and the validation gap in the use of automated topic modeling benchmarks.
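For context on what such automated benchmarks compute, here is a minimal sketch of NPMI coherence, the measure most commonly at issue; treating each document as a single co-occurrence window and using add-one smoothing are simplifying assumptions.

import math
from itertools import combinations

def npmi_coherence(top_words, docs):
    # docs: documents as lists of tokens; each doc is one co-occurrence window.
    docs = [set(d) for d in docs]
    n = len(docs)
    def p(*ws):
        return (sum(all(w in d for w in ws) for d in docs) + 1) / (n + 1)
    scores = []
    for wi, wj in combinations(top_words, 2):
        pij = p(wi, wj)
        scores.append(math.log(pij / (p(wi) * p(wj))) / -math.log(pij))
    return sum(scores) / len(scores)

docs = [["apple", "fruit", "pie"], ["apple", "mac"], ["fruit", "juice"]]
print(npmi_coherence(["apple", "fruit"], docs))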
Sawtooth Factorial Topic Embeddings Guided Gamma Belief Network
TLDR
Sawtooth factorial topic embedding guided GBN, a deep generative model of documents, is proposed; it captures the dependencies and semantic similarities between topics in the embedding space and outperforms other neural topic models at extracting deeper interpretable topics and deriving better document representations.
Topic Model or Topic Twaddle? Re-evaluating Semantic Interpretability Measures
TLDR
These evaluations show that for some specialized collections, standard coherence measures may not inform the most appropriate topic model or the optimal number of topics, and current interpretability performance validation methods are challenged as a means to confirm model quality in the absence of ground truth data.
Improving Reader Motivation with Machine Learning
This thesis focuses on the problem of increasing reading motivation with machine learning (ML). The act of reading is central to modern human life, and there is much to be gained by improving the…
Exploiting Domain-Aware Aspect Similarity for Multi-Source Cross-Domain Sentiment Classification
A Topic Coverage Approach to Evaluation of Topic Models
TLDR
An approach to topic model evaluation based on measuring topic coverage is investigated, and measures of coverage based on matching between model topics and reference topics are proposed.
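A hedged sketch of what coverage-based evaluation might look like: greedily match each reference topic to its closest model topic by cosine similarity of topic-word vectors and report the fraction matched above a threshold. The matching rule and the threshold are illustrative assumptions, not the paper's exact procedure.

import numpy as np

def coverage(ref_topics, model_topics, threshold=0.7):
    # Fraction of reference topics whose best-matching model topic is close enough.
    ref = ref_topics / np.linalg.norm(ref_topics, axis=1, keepdims=True)
    mod = model_topics / np.linalg.norm(model_topics, axis=1, keepdims=True)
    sims = ref @ mod.T                    # (num_ref, num_model) cosine similarities
    return float((sims.max(axis=1) >= threshold).mean())

rng = np.random.default_rng(1)
ref = rng.dirichlet(np.full(100, 0.1), size=5)          # sparse reference topics
mod = np.vstack([ref[:3] + 0.01 * rng.random((3, 100)),  # 3 roughly recovered topics
                 rng.dirichlet(np.full(100, 0.1), size=4)])
print(coverage(ref, mod))               # fraction of reference topics recovered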

References

Showing 1-10 of 89 references
Dirichlet belief networks for topic structure learning
TLDR
A new multi-layer generative process on the word distributions of topics is proposed, where each layer consists of a set of topics and each topic is drawn from a mixture of the topics in the layer above; the process is able to discover interpretable topic hierarchies.
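The layer-wise construction can be sketched directly: each lower-layer topic's word distribution is a Dirichlet draw whose mean mixes the topics of the layer above. Shapes and the concentration parameter below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
V = 1000                                          # vocabulary size
top = rng.dirichlet(np.full(V, 0.05), size=10)    # 10 top-layer topics

def next_layer(topics_above, num_topics, concentration=50.0):
    # Mixing weights: how much each child topic draws on each parent topic.
    w = rng.dirichlet(np.ones(len(topics_above)), size=num_topics)
    means = w @ topics_above                      # (num_topics, V) mixed means
    return np.stack([rng.dirichlet(concentration * m + 1e-8) for m in means])

middle = next_layer(top, 20)    # 20 topics mixed from the 10 above
bottom = next_layer(middle, 40) # 40 more specific topics below them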
A Novel Neural Topic Model and Its Supervised Extension
TLDR
A novel neural topic model (NTM) is proposed in which the representations of words and documents are efficiently and naturally combined into a uniform framework; the model is competitive in both topic discovery and classification/regression tasks.
Context Reinforced Neural Topic Modeling over Short Texts
TLDR
A Context Reinforced Neural Topic Model (CRNTM) is proposed that, by assuming each short text covers only a few salient topics, infers the topic for each word within a narrow range.
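One way to picture the "narrow range" idea: restrict each word's topic posterior to the document's few most salient topics. The top-k rule below is an illustrative reading, not necessarily CRNTM's exact mechanism.

import numpy as np

rng = np.random.default_rng(0)
K, V = 50, 200
beta = rng.dirichlet(np.full(V, 0.1), size=K)  # topic-word distributions
theta = rng.dirichlet(np.full(K, 0.1))         # a short text's topic proportions

def word_topic_posterior(word_id, k=3):
    # Only the document's k most salient topics compete for each word.
    salient = np.argsort(theta)[-k:]
    scores = theta[salient] * beta[salient, word_id]
    return salient, scores / scores.sum()

topics, probs = word_topic_posterior(word_id=7)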
Copula Guided Neural Topic Modelling for Short Texts
TLDR
This paper focuses on adapting the popular Auto-Encoding Variational Bayes based neural topic models to short texts, by exploring Archimedean copulas to guide the estimated topic distributions derived from linearly projected samples of re-parameterized posterior distributions.
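For readers unfamiliar with the tool, here is how one Archimedean copula (Clayton) can be sampled via the Marshall-Olkin method to produce correlated uniforms that could couple latent dimensions; the paper's exact copula and usage may differ.

import numpy as np

def clayton_copula_sample(dim, theta, n, rng):
    # Marshall-Olkin sampling: V ~ Gamma(1/theta), U_i = (1 + E_i/V)^(-1/theta).
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=(n, 1))
    e = rng.exponential(size=(n, dim))
    return (1.0 + e / v) ** (-1.0 / theta)  # uniforms with lower-tail dependence

rng = np.random.default_rng(0)
u = clayton_copula_sample(dim=5, theta=2.0, n=1000, rng=rng)
print(np.corrcoef(u[:, 0], u[:, 1])[0, 1])  # noticeably positive correlation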
Neural Topic Model with Attention for Supervised Learning
TLDR
A novel way to utilize document-specific topic proportions and global topic vectors learned from a neural topic model in the attention mechanism is designed, and a backpropagation inference method that allows for joint model optimisation is developed.
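A minimal sketch of one such combination: form a topic-aware query from the document's topic proportions and the global topic vectors, then attend over the word states with it. The query form below is an assumption for illustration.

import numpy as np

rng = np.random.default_rng(0)
L, D, K = 12, 64, 20                    # words, hidden dim, topics
h = rng.normal(size=(L, D))             # word hidden states
topic_vecs = rng.normal(size=(K, D))    # global topic vectors from the NTM
theta = rng.dirichlet(np.ones(K))       # document-specific topic proportions

query = theta @ topic_vecs              # topic-aware query vector, shape (D,)
scores = h @ query / np.sqrt(D)
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                    # attention weights over words
context = alpha @ h                     # topic-attended document representation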
Neural Topic Model with Reinforcement Learning
TLDR
This paper borrows the idea of reinforcement learning, incorporating topic coherence measures as reward signals to guide the learning of a VAE-based topic model that automatically and dynamically separates background words from topic words, thus eliminating the pre-processing step of filtering infrequent and/or highly frequent words typically required for learning traditional topic models.
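To make the reward signal concrete, the sketch below scores each topic by a simple co-document-frequency coherence of its top words, which could then weight topics during training; this score is a stand-in for NPMI, and the weighting scheme is illustrative.

import numpy as np

def topic_rewards(beta, doc_sets, topn=5):
    # beta: (K, V) topic-word distributions; doc_sets: documents as sets of
    # token ids. Reward = mean pairwise co-occurrence rate of each topic's top words.
    rewards = []
    for row in beta:
        top = np.argsort(row)[-topn:]
        pair_rates = [
            sum(int(a in d and b in d) for d in doc_sets) / len(doc_sets)
            for i, a in enumerate(top) for b in top[i + 1:]
        ]
        rewards.append(sum(pair_rates) / len(pair_rates))
    return np.asarray(rewards)

rng = np.random.default_rng(0)
beta = rng.dirichlet(np.full(100, 0.1), size=8)
docs = [set(rng.integers(0, 100, size=20).tolist()) for _ in range(50)]
weights = topic_rewards(beta, docs)  # higher-coherence topics get larger rewards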
Neural Variational Correlated Topic Modeling
TLDR
This paper proposes a novel Centralized Transformation Flow to capture the correlations among topics by reshaping topic distributions and presents the Transformation Flow Lower Bound to improve the performance of the proposed model.
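As a rough analogue, a generic planar normalizing-flow step can reshape a Gaussian draw before the softmax so that the resulting topic proportions become correlated; the paper's Centralized Transformation Flow is replaced here by this generic step purely for illustration.

import numpy as np

rng = np.random.default_rng(0)
K = 10
u, w = rng.normal(size=K), rng.normal(size=K)
b = 0.0

def planar_flow(z):
    # One planar flow step: z' = z + u * tanh(w.z + b).
    return z + u * np.tanh(w @ z + b)

z0 = rng.normal(size=K)        # base Gaussian draw
zk = planar_flow(z0)           # reshaped latent with induced correlations
theta = np.exp(zk - zk.max())
theta /= theta.sum()           # correlated topic proportions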
A Word Embeddings Informed Focused Topic Model
TLDR
A focused topic model is proposed in which how a topic focuses on words is informed by word embeddings; it is able to discover more informed and focused topics with more representative words, leading to better modelling accuracy and topic quality.
Document Informed Neural Autoregressive Topic Models with Distributional Prior
TLDR
Novel neural autoregressive topic model variants are presented that consistently outperform state-of-the-art generative topic models in terms of generalization, interpretability, and applicability over 7 long-text and 8 short-text datasets from diverse domains.
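The autoregressive factorisation behind DocNADE-style models can be sketched compactly: each word's probability conditions on the words before it through a shared hidden state. Sizes and the sigmoid activation below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
V, H = 300, 50
W = rng.normal(scale=0.1, size=(H, V))  # word "input" embeddings
U = rng.normal(scale=0.1, size=(V, H))  # output projection
c, b = np.zeros(H), np.zeros(V)

def doc_log_likelihood(word_ids):
    total, acc = 0.0, np.zeros(H)
    for w in word_ids:
        h = 1.0 / (1.0 + np.exp(-(c + acc)))  # hidden state from preceding words
        logits = b + U @ h
        logp = logits - logits.max() - np.log(np.exp(logits - logits.max()).sum())
        total += logp[w]                      # log p(word | words before it)
        acc += W[:, w]                        # fold this word in for the next step
    return total

print(doc_log_likelihood([3, 17, 42, 17]))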
ATM: Adversarial-neural Topic Model
TLDR
The proposed Adversarial-neural Topic Model (ATM) models topics with a Dirichlet prior and employs a generator network to capture the semantic patterns among latent topics; experiments show that ATM generates more coherent topics, outperforming a number of competitive baselines.
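A minimal sketch of the adversarial setup described: a generator maps Dirichlet-distributed topic proportions to fake word-frequency vectors, and a discriminator separates them from real documents. Architectures and losses below are illustrative assumptions.

import torch
import torch.nn as nn

V, K = 2000, 50
G = nn.Sequential(nn.Linear(K, 256), nn.ReLU(),
                  nn.Linear(256, V), nn.Softmax(dim=-1))   # generator
D = nn.Sequential(nn.Linear(V, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))                       # discriminator

dirichlet = torch.distributions.Dirichlet(torch.full((K,), 0.1))
theta = dirichlet.sample((8,))            # Dirichlet prior over topic proportions
fake = G(theta)                           # generated word distributions
real = torch.rand(8, V)
real /= real.sum(-1, keepdim=True)        # stand-in for real word frequencies

bce = nn.BCEWithLogitsLoss()
d_loss = bce(D(real), torch.ones(8, 1)) + bce(D(fake.detach()), torch.zeros(8, 1))
g_loss = bce(D(fake), torch.ones(8, 1))   # generator tries to fool the discriminator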