Seq2seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models

@article{Strobelt2019Seq2seqVisAV,
  title={Seq2seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models},
  author={Hendrik Strobelt and Sebastian Gehrmann and Michael Behrisch and Adam Perer and Hanspeter Pfister and Alexander M. Rush},
  journal={IEEE Transactions on Visualization and Computer Graphics},
  year={2019},
  volume={25},
  pages={353-363}
}
Neural sequence-to-sequence models have proven to be accurate and robust for many sequence prediction tasks, and have become the standard approach for automatic translation of text. The models work with a five-stage blackbox pipeline that begins with encoding a source sequence to a vector space and then decoding out to a new target sequence. This process is now standard, but like many deep learning methods remains quite difficult to understand or debug. In this work, we present a visual… 
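To make the encode-then-decode pipeline concrete, here is a minimal NumPy sketch of a greedy RNN encoder-decoder. It is an illustration only: the weights are random stand-ins for a trained model, and names such as rnn_step, W_e, and W_o are invented for this sketch rather than taken from the paper.

import numpy as np

# Hypothetical toy dimensions; all weights are random stand-ins for a trained model.
V, H = 12, 8                      # vocab size, hidden size
rng = np.random.default_rng(0)
E   = rng.normal(size=(V, H))     # shared embedding matrix
W_e = rng.normal(size=(H, H)); U_e = rng.normal(size=(H, H))  # encoder RNN
W_d = rng.normal(size=(H, H)); U_d = rng.normal(size=(H, H))  # decoder RNN
W_o = rng.normal(size=(H, V))     # output projection

def rnn_step(x, h, W, U):
    return np.tanh(E[x] @ W + h @ U)

def translate(src, bos=0, eos=1, max_len=10):
    # Encode: fold the source sequence into a single vector.
    h = np.zeros(H)
    for tok in src:
        h = rnn_step(tok, h, W_e, U_e)
    # Decode: greedily emit target tokens conditioned on that vector.
    out, tok = [], bos
    for _ in range(max_len):
        h = rnn_step(tok, h, W_d, U_d)
        tok = int(np.argmax(h @ W_o))
        if tok == eos:
            break
        out.append(tok)
    return out

print(translate([3, 5, 7]))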
Citations

Debugging Sequence-to-Sequence Models with Seq2Seq-Vis
TLDR
Neural attention-based sequence-to-sequence models (seq2seq) have proven to be accurate and robust for many sequence prediction tasks, but the highly connected and high-dimensional internal representations pose a challenge for analysis and visualization tools.
A Gray Box Interpretable Visual Debugging Approach for Deep Sequence Learning Model
TLDR
A visual interactive web application, d-DeVIS, that helps visualize the internal reasoning of a learning model trained on audio data, and that allows users to perceive the model's behavior and debug it by interactively generating adversarial audio data points.
ProtoSteer: Steering Deep Sequence Model with Prototypes
TLDR
This work tackles the challenge of directly involving domain experts in steering a deep sequence model without relying on model developers as intermediaries, and demonstrates that involving domain users can help obtain more interpretable models with concise prototypes while retaining similar accuracy.
AttViz: Online exploration of self-attention for transparent neural language modeling
TLDR
This work proposes AttViz, an online toolkit for exploring self-attention, the real values associated with individual text tokens, which offers novel online visualizations of attention heads and their aggregations with minimal effort.
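As a sketch of the kind of per-token attention aggregation such a toolkit plots, the following toy example averages attention mass over heads and query positions to get one real value per token. The attention tensor here is random, standing in for what a trained transformer would expose; none of the names come from AttViz itself.

import numpy as np

# Toy attention tensor: (heads, query positions, key positions);
# rows are normalized so each head/query distributes mass 1 over keys.
rng = np.random.default_rng(1)
att = rng.random((4, 6, 6))
att /= att.sum(axis=-1, keepdims=True)

# Per-token "received attention": average over heads and query positions,
# one real value per token, the kind of aggregate such tools visualize.
per_token = att.mean(axis=(0, 1))
for i, v in enumerate(per_token):
    print(f"token {i}: {v:.3f}")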
Ablate, Variate, and Contemplate: Visual Analytics for Discovering Neural Architectures
TLDR
REMAP is a visual analytics tool that allows a model builder to discover a deep learning model quickly through visual exploration of neural network architectures, rapid experimentation, and user-defined semi-automated searches through the model space.
GenNI: Human-AI Collaboration for Data-Backed Text Generation
TLDR
GenNI (Generation Negotiation Interface) is an interactive visual system for high-level human-AI collaboration in producing descriptive text; it utilizes a deep learning model designed with explicit control states, without sacrificing the representational power of deep learning models.
Dimension Reduction Approach for Interpretability of Sequence to Sequence Recurrent Neural Networks
TLDR
A dimension reduction approach to visualize and interpret how encoder-decoder recurrent neural network models (Seq2Seq) represent the data: proper orthogonal decomposition is applied to the concatenation of encoder and decoder hidden states to compute a low-dimensional embedding of the hidden-state dynamics.
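Proper orthogonal decomposition of a state matrix is equivalent to an SVD of the mean-centered data, so the core step can be sketched in a few lines; the hidden states below are random placeholders for states collected from a trained Seq2Seq model.

import numpy as np

# Stand-in for hidden states collected from an encoder-decoder run:
# rows are time steps (encoder then decoder concatenated), columns are units.
rng = np.random.default_rng(2)
H = np.vstack([rng.normal(size=(30, 64)),    # encoder hidden states
               rng.normal(size=(20, 64))])   # decoder hidden states

# Proper orthogonal decomposition == SVD of the mean-centered state matrix;
# the leading right singular vectors give a low-dimensional embedding.
Hc = H - H.mean(axis=0)
U, S, Vt = np.linalg.svd(Hc, full_matrices=False)
embedding = Hc @ Vt[:2].T            # project onto the first two POD modes
print(embedding.shape)               # (50, 2), ready for a 2-D scatter plot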
Understanding and Improving Hidden Representations for Neural Machine Translation
TLDR
This work proposes to regularize the layer-wise representations with all tree-induced tasks, and designs efficient approximation methods that select a few coarse-to-fine tasks for regularization to overcome the computational bottleneck caused by the large number of regularization terms.
Neural Machine Translation
TLDR
A comprehensive treatment of the topic, ranging from an introduction to neural networks and computation graphs to a description of the currently dominant attentional sequence-to-sequence model, recent refinements, alternative architectures, and open challenges.
VisQA: X-raying Vision and Language Reasoning in Transformers
TLDR
The design of VisQA was motivated by well-known bias examples from the fields of deep learning and vision-language reasoning; the work led to a better understanding of how neural models for VQA exploit bias, which in turn influenced their design and training through a proposed method for transferring reasoning patterns from an oracle model.
...

References

Showing 1-10 of 55 references
Sequence to Sequence Learning with Neural Networks
TLDR
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions about sequence structure, and finds that reversing the order of the words in all source sentences markedly improved the LSTM's performance, because doing so introduced many short-term dependencies between the source and the target sentence that made the optimization problem easier.
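The source-reversal trick is a one-line preprocessing step; a small illustrative sketch follows (the helper name and toy sentence pair are invented).

def reverse_sources(pairs):
    # Reverse only the source side; targets stay in natural order, so the
    # first source word ends up adjacent to the first target word.
    return [(list(reversed(src)), tgt) for src, tgt in pairs]

pairs = [(["the", "cat", "sat"], ["le", "chat", "s'assit"])]
print(reverse_sources(pairs))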
Get To The Point: Summarization with Pointer-Generator Networks
TLDR
A novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways, using a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator.
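The pointer-generator's final output distribution is a mixture of the generator's vocabulary distribution and attention mass copied from source tokens, weighted by a generation probability p_gen. A toy sketch of that mixture follows; all inputs here are made-up placeholders.

import numpy as np

def pointer_generator(p_vocab, attention, src_ids, p_gen):
    # Final distribution = p_gen * generator distribution
    #                    + (1 - p_gen) * attention mass copied from source tokens.
    final = p_gen * p_vocab
    for a, tok in zip(attention, src_ids):
        final[tok] += (1 - p_gen) * a
    return final

V = 6
p_vocab   = np.full(V, 1 / V)          # toy generator distribution
attention = np.array([0.7, 0.2, 0.1])  # toy attention over a 3-token source
src_ids   = [2, 4, 2]                  # source token ids (note the repeat)
print(pointer_generator(p_vocab, attention, src_ids, p_gen=0.5))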
LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks
TLDR
This work presents LSTMVis, a visual analysis tool for recurrent neural networks with a focus on understanding these hidden state dynamics, and describes the domain, the different stakeholders, and their goals and tasks.
Convolutional Sequence to Sequence Learning
TLDR
This work introduces an architecture based entirely on convolutional neural networks, which outperforms the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT'14 English-German and WMT'14 English-French translation at an order of magnitude faster speed, on both GPU and CPU.
A causal framework for explaining the predictions of black-box sequence-to-sequence models
TLDR
The method returns an “explanation” consisting of groups of causally related input-output tokens, inferred by querying the model with perturbed inputs, generating a graph over tokens from the responses, and solving a partitioning problem to select the most relevant components.
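A heavily simplified sketch of the perturb-and-observe idea follows; it stands in for only the querying step (the graph construction and partitioning are omitted), and all names are invented for illustration.

import numpy as np

def input_saliency(model, src, pad=0):
    # Perturb one source token at a time and record which output positions
    # change: a crude stand-in for querying the model with perturbed inputs.
    base = model(src)
    scores = np.zeros((len(src), len(base)))
    for i in range(len(src)):
        perturbed = list(src)
        perturbed[i] = pad
        out = model(perturbed)
        for j, (b, o) in enumerate(zip(base, out)):
            scores[i, j] = float(b != o)
    return scores   # scores[i, j] ~ "input token i influences output token j"

toy_model = lambda s: [t + 1 for t in s]   # placeholder seq2seq stand-in
print(input_saliency(toy_model, [3, 5, 7]))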
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
TLDR
GNMT, Google's Neural Machine Translation system, is presented; it attempts to address many of the weaknesses of conventional phrase-based translation systems and strikes a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models.
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
TLDR
This work proposes several novel models that address critical problems in summarization which are not adequately modeled by the basic architecture, such as modeling keywords, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.
Attention is All you Need
TLDR
A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as shown by its successful application to English constituency parsing with both large and limited training data.
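The Transformer's core operation, scaled dot-product attention, follows the paper's formula Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V and fits in a few lines of NumPy; the random Q, K, V below are placeholders for learned projections.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V, computed row-wise with a stable softmax.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(3)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)   # (4, 8)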
A Deep Reinforced Model for Abstractive Summarization
TLDR
A neural network model with a novel intra-attention that attends over the input and the continuously generated output separately, combined with a new training method that mixes standard supervised word prediction and reinforcement learning (RL), produces higher-quality summaries.
A Convolutional Encoder Model for Neural Machine Translation
TLDR
A faster and simpler architecture based on a succession of convolutional layers is presented, which allows the source sentence to be encoded simultaneously, in contrast to recurrent networks, for which computation is constrained by temporal dependencies.
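A sketch of why a convolutional encoder parallelizes over positions: each output depends only on a fixed window of inputs, so all positions can be computed at once. Dimensions and weights below are arbitrary placeholders.

import numpy as np

# One convolutional encoder layer: every output position is computed from a
# fixed window of embeddings, so all positions run in parallel (unlike an
# RNN, which must walk the sequence step by step).
rng = np.random.default_rng(4)
X = rng.normal(size=(10, 16))            # 10 token embeddings, width 16
W = rng.normal(size=(3, 16, 16))         # kernel width 3
pad = np.pad(X, ((1, 1), (0, 0)))        # zero-pad so output length matches
H = np.tanh(sum(pad[k:k+10] @ W[k] for k in range(3)))
print(H.shape)                           # (10, 16)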
...