A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications
TLDR
The first public dataset of scientific peer reviews available for research purposes (PeerRead v1) is presented, and simple models are shown to predict whether a paper is accepted with up to 21% error reduction compared to the majority baseline.
Recommendation as a Communication Game: Self-Supervised Bot-Play for Goal-oriented Dialogue
TLDR
This work collects a goal-driven recommendation dialogue dataset (GoRecDial), which consists of 9,125 dialogue games and 81,260 conversation turns between pairs of human workers recommending movies to each other, and uses the dataset to develop an end-to-end dialogue system that can simultaneously converse and recommend.
Earlier Isn't Always Better: Sub-aspect Analysis on Corpus and System Biases in Summarization
TLDR
While position exhibits substantial bias in news articles, this is not the case, for example, with academic papers and meeting minutes, and the empirical study shows that different types of summarization systems are composed of different degrees of the sub-aspects.
AdvEntuRe: Adversarial Training for Textual Entailment with Knowledge-Guided Examples
TLDR
This work proposes knowledge-guided adversarial example generators for incorporating large lexical resources in entailment models via only a handful of rule templates, along with the first GAN-style approach for training the entailment model using a natural language example generator that iteratively adjusts to the discriminator’s weaknesses.
Detecting and Explaining Causes From Text For a Time Series Event
TLDR
This work proposes a novel method based on the Granger causality of time series between features extracted from text such as N-grams, topics, sentiments, and their composition to detect causal features from text.
xSLUE: A Benchmark and Analysis Platform for Cross-Style Language Understanding and Evaluation
TLDR
This paper provides a benchmark corpus (xSLUE) with an online platform for cross-style language understanding and evaluation, and shows that some styles are highly dependent on each other and that some domains are stylistically more diverse than others.
Style is NOT a single variable: Case Studies for Cross-Stylistic Language Understanding
TLDR
This paper provides the benchmark corpus (XSLUE), which combines existing datasets and a newly collected one for sentence-level cross-style language understanding and evaluation, and finds that combinations of some contradictory styles are likely to generate stylistically less appropriate text.
INSPIRED: Toward Sociable Recommendation Dialog Systems
TLDR
This work designs an annotation scheme for recommendation strategies based on social science theories, annotates the dialogs with it, and shows that sociable recommendation strategies, such as sharing personal opinions or communicating with encouragement, more frequently lead to successful recommendations.
(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Annotated Stylistic Language Dataset with Multiple Personas
TLDR
PASTEL, a parallel and annotated stylistic language dataset containing ~41K parallel sentences (8.3K parallel stories) annotated across different personas, is released, and a simple supervised model with the authors' parallel text is shown to outperform unsupervised models using nonparallel text in style transfer.
Posterior Calibrated Training on Sentence Classification Tasks
TLDR
It is shown that PosCal not only helps reduce the calibration error but also improves task performance by penalizing drops in performance of both objectives, and that it is easily extendable to any type of classification task as a form of regularization term.