• Publications
A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications
TLDR
The first public dataset of scientific peer reviews available for research purposes (PeerRead v1) is presented and it is shown that simple models can predict whether a paper is accepted with up to 21% error reduction compared to the majority baseline.
Recommendation as a Communication Game: Self-Supervised Bot-Play for Goal-oriented Dialogue
TLDR
This work collects a goal-driven recommendation dialogue dataset (GoRecDial), which consists of 9,125 dialogue games and 81,260 conversation turns between pairs of human workers recommending movies to each other, and uses the dataset to develop an end-to-end dialogue system that can simultaneously converse and recommend.
Earlier Isn't Always Better: Sub-aspect Analysis on Corpus and System Biases in Summarization
TLDR
While position exhibits substantial bias in news articles, this is not the case, for example, with academic papers and meeting minutes, and the empirical study shows that different types of summarization systems are composed of different degrees of the sub-aspects.
AdvEntuRe: Adversarial Training for Textual Entailment with Knowledge-Guided Examples
TLDR
This work proposes knowledge-guided adversarial example generators for incorporating large lexical resources in entailment models via only a handful of rule templates, and proposes the first GAN-style approach for training it using a natural language example generator that iteratively adjusts based on the discriminator's performance.
Detecting and Explaining Causes From Text For a Time Series Event
TLDR
This work proposes a novel method based on the Granger causality of time series between features extracted from text such as N-grams, topics, sentiments, and their composition to detect causal features from text.
xSLUE: A Benchmark and Analysis Platform for Cross-Style Language Understanding and Evaluation
TLDR
This paper provides a benchmark corpus (xSLUE) with an online platform for cross-style language understanding and evaluation and shows that some styles are highly dependent on each other, and some domains are stylistically more diverse than others.
GenAug: Data Augmentation for Finetuning Text Generators
TLDR
This paper proposes and evaluates various augmentation methods, including some that incorporate external knowledge, for finetuning GPT-2 on a subset of Yelp Reviews, and examines the relationship between the amount of augmentation and the quality of the generated text.
News2Images: Automatically Summarizing News Articles into Image-Based Contents via Deep Learning
TLDR
This work presents a method for generating compact image-based contents from news documents (News2Images) that uses word embedding for document summarization and convolutional neural networks for sentence-to-image transformation to deliver the core contents of the news to users.
Bridging Knowledge Gaps in Neural Entailment via Symbolic Models
TLDR
This work proposes a fact-level decomposition of the hypothesis, and proposes a knowledge lookup module for verifying the resulting sub-facts against both the textual premise and the structured KB of science facts.
Posterior Calibrated Training on Sentence Classification Tasks
TLDR
It is shown that PosCal not only helps reduce the calibration error but also improves task performance by penalizing drops in performance of both objectives, and can be easily extended to any type of classification task as a form of regularization term.