Visualizing textual models with in-text and word-as-pixel highlighting

Figure: A topic model's token-level posterior memberships P(z_t | w_t), shown as in-text annotation (§3) and word-as-pixel (§4) views, from a corpus of U.S. presidential State of the Union speeches. Speeches are concatenated, running in columns; top-left is 1946, bottom-right is 2007. (This version shows a sample of tokens.)

Abstract

We explore two techniques that use color to make sense of statistical text models. One method uses in-text annotations to illustrate a model's view of particular tokens in particular documents. The other uses a high-level, "words-as-pixels" graphic to display an entire corpus. Together, these methods offer both zoomed-in and zoomed-out perspectives on a model's understanding of text. We show how these interconnected methods help diagnose a classifier's poor performance on Twitter slang and make sense of a topic model of historical political texts.

2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY, USA. Copyright by the author(s).

Cite this paper

@inproceedings{Handler2016VisualizingTM,
  title={Visualizing textual models with in-text and word-as-pixel highlighting},
  author={Abram Handler and Su Lin Blodgett and Brendan O'Connor},
  year={2016}
}