Evaluation of Text Generation: A Survey
This paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models.
Creative Writing with a Machine in the Loop: Case Studies on Slogans and Stories
Novel natural language models and design choices are suggested that may better support creative writing, as machine suggestions do not necessarily lead to better written artifacts.
Sentence Mover’s Similarity: Automatic Evaluation for Multi-Sentence Texts
This work introduces methods based on sentence mover’s similarity, and finds that sentence-based metrics correlate with human judgments significantly better than ROUGE, both on machine-generated summaries and human-authored essays.
Counterfactual Story Reasoning and Generation
- Lianhui Qin, Antoine Bosselut, Ari Holtzman, Chandra Bhagavatula, Elizabeth Clark, Yejin Choi
- Computer Science, EMNLP
- 9 September 2019
This paper proposes Counterfactual Story Rewriting: given an original story and an intervening counterfactual event, the task is to minimally revise the story to make it compatible with the given counterfactual event.
Sounding Board: A User-Centric and Content-Driven Social Chatbot
The system architecture consists of several components including spoken language processing, dialogue management, language generation, and content management, with emphasis on user-centric and content-driven design.
Tweet! - And I Can Tell How Many Followers You Have
An approach to predicting the follower counts of Twitter users from a small number of their tweets is reported, and a pattern of textual features that demonstrates the correlation between Twitter-specific communication and the number of followers is found.
Evaluating Machines by their Real-World Language Use
- Rowan Zellers, Ari Holtzman, Elizabeth Clark, Lianhui Qin, Ali Farhadi, Yejin Choi
- Computer Science, ArXiv
- 7 April 2020
This work proposes to evaluate machines by their success at real-world language use, which greatly expands the scope of language tasks that can be measured and studied, and introduces TuringAdvice, a new challenge for language understanding systems.
Exploring the Effect of Author and Reader Identity in Online Story Writing: the STORIESINTHEWILD Corpus
Compared to younger readers, readers age 45 and older consider stories significantly less creative and less entertaining. The results suggest that reader and writer demographics, as well as writing setup, should be accounted for in story writing evaluations.
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
GEMv2, the new version of the Generation, Evaluation, and Metrics Benchmark, introduces a modular infrastructure for dataset, model, and metric developers to benefit from each other's work.
Event-centric Context Modeling: The Case of Story Comprehension and Story Generation
In this opinion piece, we argue that there is a need for alternative design directions to complement existing AI efforts in narrative and character generation and algorithm development. To make our…