Publications
Topics to Avoid: Demoting Latent Confounds in Text Classification
TLDR
This work proposes a method that represents the latent topical confounds and a model that “unlearns” confounding features by predicting both the label of the input text and the confound; it shows that this model generalizes better and learns features that are indicative of writing style rather than content.
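One common way to implement this kind of “unlearning” is an adversarial confound head behind a gradient-reversal layer. The sketch below is a minimal PyTorch illustration, not the paper's exact architecture; all dimensions, module names, and the stand-in inputs are chosen for the example.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

class DemotingClassifier(nn.Module):
    """Shared encoder with a label head and an adversarial confound head."""
    def __init__(self, input_dim, hidden_dim, n_labels, n_confounds):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.label_head = nn.Linear(hidden_dim, n_labels)
        self.confound_head = nn.Linear(hidden_dim, n_confounds)

    def forward(self, x):
        h = self.encoder(x)
        # Gradient reversal pushes the encoder to unlearn confound-predictive features.
        return self.label_head(h), self.confound_head(GradReverse.apply(h))

# Joint objective: predict the label and (adversarially) the latent topical confound.
model = DemotingClassifier(input_dim=300, hidden_dim=128, n_labels=2, n_confounds=50)
x = torch.randn(8, 300)         # stand-in text representations
y = torch.randint(0, 2, (8,))   # task labels
z = torch.randint(0, 50, (8,))  # confound (topic) ids
label_logits, confound_logits = model(x)
loss = nn.functional.cross_entropy(label_logits, y) + nn.functional.cross_entropy(confound_logits, z)
loss.backward()
```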
Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs
TLDR
This work proposes a general technique for replacing the softmax layer with a continuous embedding layer, introducing a novel probabilistic loss together with a training and inference procedure in which the model generates a probability distribution over pre-trained word embeddings instead of a multinomial distribution over the vocabulary obtained via softmax.
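As a rough sketch of the continuous-output idea (using a plain cosine surrogate rather than the paper's actual vMF negative log-likelihood with its Bessel-function normalizer), the decoder is trained to point toward the gold word's pre-trained embedding, and decoding reduces to nearest-neighbor search in the embedding table:

```python
import torch
import torch.nn.functional as F

def cosine_surrogate_loss(pred, gold_emb):
    """pred, gold_emb: (batch, dim); higher cosine similarity to the gold embedding -> lower loss."""
    return (1.0 - F.cosine_similarity(pred, gold_emb, dim=-1)).mean()

def decode_nearest(pred, embedding_table):
    """Map each predicted vector to the vocabulary item with the closest embedding."""
    sims = F.normalize(pred, dim=-1) @ F.normalize(embedding_table, dim=-1).T
    return sims.argmax(dim=-1)

vocab_emb = torch.randn(10000, 300)              # pre-trained word embeddings (frozen)
pred = torch.randn(8, 300, requires_grad=True)   # stand-in decoder outputs
gold = vocab_emb[torch.randint(0, 10000, (8,))]  # embeddings of the gold next words
cosine_surrogate_loss(pred, gold).backward()
tokens = decode_nearest(pred.detach(), vocab_emb)
```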
Earth Mover's Distance Pooling over Siamese LSTMs for Automatic Short Answer Grading
TLDR
A novel framework for ASAG is introduced by cascading three neural building blocks: Siamese bidirectional LSTMs applied to a model answer and a student answer, a novel pooling layer based on earth mover's distance (EMD) across all hidden states from both LSTMs, and a flexible final regression layer to output scores.
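The EMD pooling step can be illustrated with the POT optimal-transport library. The sketch below is a non-differentiable NumPy illustration of the distance itself; the paper integrates EMD as a pooling layer inside an end-to-end network, and the hidden states here are random stand-ins.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

def emd_pool(model_states, student_states):
    """Earth mover's distance between two sets of BiLSTM hidden states.

    model_states:   (T1, d) hidden states over the model answer
    student_states: (T2, d) hidden states over the student answer
    """
    a = np.full(len(model_states), 1.0 / len(model_states))      # uniform mass per time step
    b = np.full(len(student_states), 1.0 / len(student_states))
    cost = ot.dist(model_states, student_states, metric="euclidean")
    return ot.emd2(a, b, cost)  # optimal-transport cost, i.e. the EMD

model_h = np.random.randn(12, 64)
student_h = np.random.randn(9, 64)
distance = emd_pool(model_h, student_h)  # a regression layer would map this to a score
```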
Controlled Text Generation as Continuous Optimization with Multiple Constraints
TLDR
This work formulates the decoding process as an optimization problem in which the multiple attributes to be controlled can be easily incorporated as differentiable constraints, by relaxing the discrete optimization to a continuous one.
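A toy version of decoding-as-optimization keeps the output sequence as a matrix of relaxed token distributions and runs gradient descent on a fluency energy plus weighted constraint terms. The sketch below uses fixed penalty weights and placeholder energies for illustration, whereas the paper treats the constraints with a Lagrangian formulation.

```python
import torch

def decode_with_constraints(lm_energy, constraints, weights, seq_len, vocab_size,
                            steps=200, lr=0.1):
    """Minimize lm_energy(soft_seq) + sum_i w_i * constraint_i(soft_seq).

    soft_seq is a (seq_len, vocab_size) matrix of token distributions, i.e. the
    continuous relaxation of the discrete output; all callables are assumed to be
    differentiable in soft_seq.
    """
    logits = torch.zeros(seq_len, vocab_size, requires_grad=True)
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        soft_seq = torch.softmax(logits, dim=-1)
        loss = lm_energy(soft_seq) + sum(w * c(soft_seq) for w, c in zip(weights, constraints))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.softmax(logits, dim=-1).argmax(dim=-1)  # discretize at the end

# Placeholder energies, purely illustrative.
fluency = lambda s: s.pow(2).sum()                  # stands in for the language-model term
constraint = lambda s: (s[:, 0].sum() - 3.0).abs()  # stands in for an attribute constraint
tokens = decode_with_constraints(fluency, [constraint], [1.0], seq_len=10, vocab_size=50)
```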
A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards
TLDR
This work proposes an end-to-end cross-lingual text summarization model that uses reinforcement learning to directly optimize a bilingual semantic similarity metric between summaries generated in a target language and gold summaries in a source language.
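At its core this is policy-gradient training with a similarity reward. A minimal self-critical REINFORCE loss might look like the sketch below, where the reward is assumed to come from some cross-lingual sentence-similarity scorer (not specified here):

```python
import torch

def reinforce_loss(sample_logprobs, rewards, baseline_rewards=None):
    """Policy-gradient loss for summarization with a bilingual similarity reward.

    sample_logprobs:  (batch,) summed log-probabilities of sampled target-language summaries
    rewards:          (batch,) similarity between each sampled summary and the gold
                      source-language summary (from a cross-lingual scorer; assumed)
    baseline_rewards: optional (batch,) rewards of greedy summaries (self-critical baseline)
    """
    baseline = baseline_rewards if baseline_rewards is not None else rewards.mean()
    advantage = (rewards - baseline).detach()
    return -(advantage * sample_logprobs).mean()

logp = torch.randn(4, requires_grad=True)  # stand-in log-probabilities of sampled summaries
r = torch.rand(4)                          # stand-in similarity rewards
reinforce_loss(logp, r).backward()
```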
Machine Translation into Low-resource Language Varieties
TLDR
This work proposes a general framework to rapidly adapt MT systems to generate language varieties that are close to, but different from, the standard target language, using no parallel (source–variety) data.
Neural Abstractive Summarization with Structural Attention
TLDR
This work presents a hierarchical encoder based on structural attention to model inter-sentence and inter-document dependencies among the sentences of the input, and shows that the proposed model achieves significant improvements over the baseline in both single- and multi-document summarization settings.
A Margin-based Loss with Synthetic Negative Samples for Continuous-output Machine Translation
TLDR
This work presents syn-margin loss, a novel margin-based loss that uses a synthetic negative sample constructed from only the predicted and target embeddings at every step, and finds that it provides small but significant improvements over both vMF and standard margin-based losses in continuous-output neural machine translation.
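A sketch of such a loss is below. The particular synthetic negative used here (the normalized component of the prediction orthogonal to the target) is an illustrative assumption; the point is that the negative is built from the two embeddings at each step rather than sampled from data.

```python
import torch
import torch.nn.functional as F

def syn_margin_loss(pred, target, margin=0.5):
    """Margin loss whose negative is synthesized from the predicted and target embeddings."""
    t = F.normalize(target, dim=-1)
    # Illustrative negative: the direction in which the prediction deviates from the target.
    residual = pred - (pred * t).sum(-1, keepdim=True) * t
    neg = F.normalize(residual, dim=-1)
    pos_sim = F.cosine_similarity(pred, target, dim=-1)
    neg_sim = F.cosine_similarity(pred, neg, dim=-1)
    return F.relu(margin - pos_sim + neg_sim).mean()

pred = torch.randn(8, 300, requires_grad=True)  # stand-in decoder outputs
gold = torch.randn(8, 300)                      # stand-in target embeddings
syn_margin_loss(pred, gold).backward()
```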
End-to-End Differentiable GANs for Text Generation
TLDR
While this approach, without any pretraining, is more stable during training and outperforms other GAN-based approaches, it still falls behind MLE; this gap is found to be due to the autoregressive nature and architectural requirements of text generation, as well as a fundamental difference between the definition of the Wasserstein distance in the image and text domains.
Constrained Sampling from Language Models via Langevin Dynamics in Embedding Spaces
TLDR
This work proposes a sampling procedure that combines the log-likelihood of the language model with arbitrary differentiable constraints into a single energy function, and generates samples by initializing the entire output sequence with noise and following a Markov chain defined by Langevin dynamics using the gradients of this energy.
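A minimal sketch of the sampler follows, with a toy stand-in for the energy; the real energy combines the LM log-likelihood with the constraint terms, and decoding maps the sampled soft embeddings back to tokens.

```python
import torch

def langevin_sample(energy_fn, seq_len, emb_dim, steps=500, step_size=0.01):
    """Sample a sequence of soft embeddings from exp(-E) via Langevin dynamics.

    The whole output sequence is initialized with noise and updated jointly:
    x <- x - eta * grad E(x) + sqrt(2 * eta) * N(0, I).
    """
    x = torch.randn(seq_len, emb_dim, requires_grad=True)
    for _ in range(steps):
        grad, = torch.autograd.grad(energy_fn(x), x)
        with torch.no_grad():
            x += -step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(x)
    return x.detach()  # in practice, map back to tokens (e.g. nearest embedding)

# Toy energy that pulls every position toward one target vector (illustrative only).
target = torch.randn(16)
toy_energy = lambda x: ((x - target) ** 2).sum()
sample = langevin_sample(toy_energy, seq_len=10, emb_dim=16)
```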
...
...