UTOPIAN: User-Driven Topic Modeling Based on Interactive Nonnegative Matrix Factorization

@article{Choo2013UTOPIANUT,
  title={UTOPIAN: User-Driven Topic Modeling Based on Interactive Nonnegative Matrix Factorization},
  author={Jaegul Choo and Changhyun Lee and Chandan K. Reddy and Haesun Park},
  journal={IEEE Transactions on Visualization and Computer Graphics},
  year={2013},
  volume={19},
  pages={1992--2001}
}
Topic modeling has been widely used for analyzing text document collections. [...] Centered around its semi-supervised formulation, UTOPIAN enables users to interact with the topic modeling method and steer the result in a user-driven manner. We demonstrate the capability of UTOPIAN via several usage scenarios with real-world document corpora such as the InfoVis/VAST paper data set and product review data sets.
TopicLens: Efficient Multi-Level Visual Topic Exploration of Large-Scale Document Collections
TLDR
This work proposes a novel interaction technique called TopicLens that allows a user to dynamically explore data through a lens interface where topic modeling and the corresponding 2D embedding are efficiently computed on the fly.
Localized user-driven topic discovery via boosted ensemble of nonnegative matrix factorization
TLDR
This work presents a novel ensemble method based on nonnegative matrix factorization that discovers meaningful local topics of interest to users and extends this ensemble model by adding keyword- and document-based user interaction to introduce user-driven topic discovery.
Nonnegative Matrix Factorization for Interactive Topic Modeling and Document Clustering
TLDR
In the context of clustering, this framework provides a flexible way to extend NMF, for example to sparse NMF and weakly-supervised NMF, and it effectively works as the basis for the visual analytic topic modeling system that is presented.
L-EnsNMF: Boosted Local Topic Discovery via Ensemble of Nonnegative Matrix Factorization
TLDR
A novel ensemble model of nonnegative matrix factorization for topic modeling applications that successively performs NMF given a residual matrix obtained from previous stages and generates a sequence of topic sets, which in turn delivers high-quality, focused topics of interest to users.
Semantic Nonnegative Matrix Factorization with Automatic Model Determination for Topic Modeling
TLDR
SeNMFk is introduced, a semantic-assisted NMF-based topic modeling method, which incorporates semantic correlations in NMF by using a word-context matrix, and employs a method for determination of the number of latent topics.
Local Topic Discovery via Boosted Ensemble of Nonnegative Matrix Factorization
TLDR
The novelty of this method lies in the fact that it utilizes the residual matrix inspired by a state-of-the-art gradient boosting model and applies a sophisticated local weighting scheme on the given matrix to enhance the locality of topics, which in turn delivers high-quality, focused topics of interest to users.
Discovery via Boosted Ensemble of Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) has been increasingly popular for topic modeling of large-scale documents. However, the resulting topics often represent only general, thus redundant information [...]
Human-Centered and Interactive: Expanding the Impact of Topic Models
Statistical topic modeling is a common tool for summarizing the themes in a document corpus. Due to the complexity of topic modeling algorithms, however, their results are not accessible to [...]
LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation
TLDR
LDAExplore, a tool to visualize topic distributions in a given document corpus that are generated using topic modeling methods, is presented; it is designed for users who have minimal knowledge of LDA or topic modeling methods.
Simultaneous Discovery of Common and Discriminative Topics via Joint Nonnegative Matrix Factorization
TLDR
A novel topic modeling method based on joint nonnegative matrix factorization, which simultaneously discovers common as well as discriminative topics given multiple document sets and is capable of utilizing only the most representative, thus meaningful, keywords in each topic through a novel pseudo-deflation approach.

References

SHOWING 1-10 OF 39 REFERENCES
Interactive Topic Modeling
TLDR
This work presents a new way of picking words to represent a topic, and presents a novel method for interactive topic modeling that allows the user to give live feedback on the topics, and allows the inference algorithm to use that feedback to guide the LDA parameter search.
Learning Topic Models -- Going beyond SVD
TLDR
This paper formally justifies Nonnegative Matrix Factorization (NMF) as a main tool in this context, which is an analog of SVD where all vectors are nonnegative, and gives the first polynomial-time algorithm for learning topic models without the above two limitations.
iVisClustering: An Interactive Visual Document Clustering via Topic Modeling
TLDR
An interactive visual analytics system for document clustering, called iVisClustering, is proposed based on a widely-used topic modeling method, latent Dirichlet allocation (LDA), which provides a summary of each cluster in terms of its most representative keywords and visualizes soft clustering results in parallel coordinates.
TopicNets: Visual Analysis of Large Text Corpora with Topic Modeling
TLDR
A discussion of the design and implementation choices for each visual analysis technique is presented, followed by a discussion of three diverse use cases in which TopicNets enables fast discovery of information that is otherwise hard to find.
ParallelTopics: A probabilistic approach to exploring document collections
TLDR
A novel visual analytics system, ParallelTopics, which integrates a state-of-the-art probabilistic topic model, Latent Dirichlet Allocation (LDA), with interactive visualization to help users make sense of large text corpora.
A Framework for Incorporating General Domain Knowledge into Latent Dirichlet Allocation Using First-Order Logic
TLDR
A scalable inference technique using stochastic gradient descent is developed which may also be useful to the Markov Logic Network (MLN) research community and the expressive power of Fold·all is demonstrated.
Latent Dirichlet Allocation
We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and [...]
Fast Nonnegative Matrix Factorization: An Active-Set-Like Method and Comparisons
TLDR
A novel algorithm for NMF based on the ANLS framework is presented; it builds upon the block principal pivoting method for the nonnegativity-constrained least squares problem and overcomes a limitation of the active set method.
TextFlow: Towards Better Understanding of Evolving Topics in Text
TLDR
This paper introduces TextFlow, a seamless integration of visualization and topic mining techniques, for analyzing various evolution patterns that emerge from multiple topics, and extends an existing analysis technique to extract three-level features.
DClusterE: A Framework for Evaluating and Understanding Document Clustering Using Visualization
  • Y. Zhang, Tao Li
  • Computer Science
  • TIST
  • 2012
TLDR
DClusterE integrates cluster validation with user interactions and offers rich visualization tools for users to examine document clustering results from multiple perspectives, and provides not only different aspects of document inter/intra-clustering structures, but also the corresponding relationship between cluster results and the ground truth.