Termite: visualization techniques for assessing textual topic models

Jason Chuang, Christopher D. Manning, Jeffrey Heer. International Working Conference on Advanced Visual Interfaces.

Topic models aid analysis of text corpora by identifying latent topics based on co-occurring words. [...] We contribute a novel saliency measure for selecting relevant terms and a seriation algorithm that both reveals clustering structure and promotes the legibility of related terms. In a series of examples, we demonstrate how Termite allows analysts to identify coherent and significant themes.
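
The abstract above names a saliency measure without stating it. As a minimal sketch, assuming the commonly cited form saliency(w) = P(w) × Σ_T P(T|w) log(P(T|w) / P(T)), the score could be computed from a fitted topic model as follows. The function name `term_saliency`, its inputs, and the toy numbers are illustrative assumptions, not taken from the paper or from any Termite code.

```python
import math

def term_saliency(topic_term, topic_weights):
    """Sketch of a saliency score per term (assumed formula, see lead-in).

    topic_term[t][w] is P(w | T=t); topic_weights[t] is P(T=t).
    Returns one saliency value per vocabulary term.
    """
    n_topics = len(topic_term)
    n_terms = len(topic_term[0])
    scores = []
    for w in range(n_terms):
        # Joint P(w, T) = P(w | T) * P(T); summing over topics gives P(w).
        joint = [topic_term[t][w] * topic_weights[t] for t in range(n_topics)]
        p_w = sum(joint)
        if p_w == 0.0:
            scores.append(0.0)
            continue
        # Distinctiveness: KL divergence between P(T | w) and P(T).
        distinctiveness = 0.0
        for t in range(n_topics):
            p_t_given_w = joint[t] / p_w
            if p_t_given_w > 0.0:
                distinctiveness += p_t_given_w * math.log(p_t_given_w / topic_weights[t])
        scores.append(p_w * distinctiveness)
    return scores

# Hypothetical toy model: 2 equally likely topics, 3 terms.
toy_topic_term = [
    [0.5, 0.5, 0.0],  # topic 0
    [0.5, 0.0, 0.5],  # topic 1
]
toy_topic_weights = [0.5, 0.5]
toy_scores = term_saliency(toy_topic_term, toy_topic_weights)
```

In this toy model the first term is equally likely under both topics, so it carries no information about the topic and scores zero, while the two topic-specific terms receive equal positive scores.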

Hiérarchie: Visualization for Hierarchical Topic Models

Hiérarchie is presented, an interactive visualization that adds structure to large topic models, making them approachable and useful to an end user and demonstrating its ability to analyze a diverse document set regarding a trending news topic.

Interactive Visualization for Topic Model Curation

This paper uses interactive topic modeling of the White House online petition data as a lens to bring up key points of discussion and to highlight the unsolved problems as well as the potential utility of visual analytics methods.

Topicks: Visualizing complex topic models for user comprehension

The interactive visualization of topic models is a promising approach to summarizing large sets of textual data; incorporating a radial layout and interaction with topic and term nodes provides the user with various ways to manipulate the visualization and explore the data.

Interactive Visual Exploration of Topic Models using Graphs

A novel design that uses graphs to visually communicate topic structure and meaning by connecting topic nodes via descriptive keyterms, which reveals topic similarities, topic meaning and shared, ambiguous keyterms.

Progressive Learning of Topic Modeling Parameters: A Visual Analytics Framework

This work presents a modular visual analytics framework, tackling the understandability and adaptability of topic models through a user-driven reinforcement learning process which does not require a deep understanding of the underlying topic modeling algorithms.

Hiérarchie: Interactive Visualization for Hierarchical Topic Models

Hiérarchie is presented, an interactive visualization that adds structure to large topic models, making them approachable and useful to an end user.

Evaluating Visual Representations for Topic Understanding and Their Effects on Manually Generated Topic Labels

This study compares labels generated by users given four topic visualization techniques—word lists, word lists with bars, word clouds, and network graphs—against each other and against automatically generated labels.

A Topic-Based Search, Visualization, and Exploration System

This paper uses a popular topic modeling algorithm, Latent Dirichlet Allocation, to derive topic distributions for articles and allows users to specify personal topic distribution to contextualize the exploration experience.

Topic Models and Metadata for Visualizing Text Corpora

A new web-based tool is presented that integrates topics learned from an unsupervised topic model into a faceted browsing experience in which the user can manage topics, filter documents by topic, and summarize views with metadata and topic graphs.

Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment

This work compares 10,000 topic model variants to 200 expert-provided domain concepts, and demonstrates how the framework can inform choices of model parameters, inference algorithms, and intrinsic measures of topical quality.



The Topic Browser: An Interactive Tool for Browsing Topic Models

This work presents an interactive tool that incorporates both prior work in displaying topic models as well as some novel ideas that greatly enhance the visualization of these models.

Reading Tea Leaves: How Humans Interpret Topic Models

New quantitative methods for measuring semantic meaning in inferred topics are presented, showing that they capture aspects of the model that are undetected by previous measures of model quality based on held-out likelihood.

Interpretation and trust: designing model-driven visualizations for text analysis

A novel similarity measure for text collections, based on a notion of "word-borrowing" that arose from an iterative design process, is presented along with a set of design recommendations for promoting interpretable and trustworthy visual analysis tools.

Optimizing Semantic Coherence in Topic Models

A novel statistical topic model based on an automated evaluation metric that significantly improves topic quality in a large-scale document collection from the National Institutes of Health (NIH).

Topic Significance Ranking of LDA Generative Models

This paper presents the first automated unsupervised analysis of LDA models to distinguish junk topics from legitimate ones, and to rank topic significance.

Evaluating topic models for digital libraries

This large-scale user study includes over 70 human subjects evaluating and scoring almost 500 topics learned from collections from a wide range of genres and domains, and shows how a scoring model, based on pointwise mutual information of word pairs using Wikipedia, Google and MEDLINE as external data sources, performs well at predicting human scores.

TileBars: visualization of term distribution information in full text information access

This paper argues for making use of text structure when retrieving from full text documents, and presents a visualization paradigm, called TileBars, that demonstrates the usefulness of explicit term distribution information in Boolean-type queries.

Jigsaw: Supporting Investigative Analysis through Interactive Visualization

Jigsaw is a visual analytic system that represents documents and their entities visually in order to help analysts examine them more efficiently and develop theories about potential actions more quickly.

Review spotlight: a user interface for summarizing user-generated reviews using adjective-noun word pairs

Review Spotlight provides a brief overview of reviews using adjective-noun word pairs, and allows the user to quickly explore the reviews in greater detail, and shows that participants could form detailed impressions about restaurants and decide between two options significantly faster with Review Spotlight than with traditional review webpages.

Studying the History of Ideas Using Topic Models

Unsupervised topic modeling is applied to the ACL Anthology to analyze historical trends in the field of Computational Linguistics from 1978 to 2006, finding trends including the rise of probabilistic methods starting in 1988, a steady increase in applications, and a sharp decline of research in semantics and understanding between 1978 and 2001.