The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers

@inproceedings{Buschek2021TheIO,
  title={The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers},
  author={Daniel Buschek and Martin Z{\"u}rn and Malin Eiband},
  booktitle={Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems},
  year={2021}
}
We present an in-depth analysis of the impact of multi-word suggestion choices from a neural language model on user behaviour regarding input and text composition in email writing. Our study for the first time compares different numbers of parallel suggestions, and use by native and non-native English writers, to explore a trade-off of “efficiency vs ideation”, emerging from recent literature. We built a text editor prototype with a neural language model (GPT-2), refined in a prestudy with 30… 
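The prototype described above shows several parallel phrase suggestions sampled from GPT-2. As an illustration of the interaction concept only (not the authors' code; the function name, word limit, and sample strings are hypothetical), the sketch below turns a batch of sampled continuations into a fixed number of short, deduplicated parallel suggestions:

```python
# Sketch: turning N sampled continuations from a language model into
# parallel phrase suggestions. Hypothetical helper, not the paper's code.

def to_phrase_suggestions(continuations, max_words=5, n_parallel=3):
    """Trim each sampled continuation to a short phrase and keep the
    first n_parallel unique ones, preserving sample order."""
    seen, phrases = set(), []
    for text in continuations:
        phrase = " ".join(text.strip().split()[:max_words])
        key = phrase.lower()
        if phrase and key not in seen:
            seen.add(key)
            phrases.append(phrase)
        if len(phrases) == n_parallel:
            break
    return phrases

# In a real prototype, the continuations would come from a language
# model (e.g. sampling several sequences from GPT-2) and the decoded
# strings would be passed through to_phrase_suggestions().
samples = [
    "thank you for your quick reply and",
    "thank you for your quick reply and",  # duplicate sample, dropped
    "thanks for getting back to me so",
    "I hope this email finds you well",
]
print(to_phrase_suggestions(samples))
# → ['thank you for your quick', 'thanks for getting back to', 'I hope this email finds']
```

Trimming to a few words and deduplicating reflects the general design of parallel-suggestion UIs, where each of the displayed slots should offer a visibly distinct continuation; the model call itself is omitted to keep the sketch self-contained.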

Citing Papers

Wordcraft: Story Writing With Large Language Models
TLDR
This work built Wordcraft, a text editor in which users collaborate with a generative language model to write a story, and shows that large language models enable novel co-writing experiences.
Inspiration through Observation: Demonstrating the Influence of Automatically Generated Text on Creative Writing
TLDR
Analysis of how observing examples of automatically generated text influences writing gives evidence of an “inspiration through observation” paradigm for human-computer collaborative writing, through which human writing can be enhanced by text generation models without directly copying their output.
The Case for a Single Model that can Both Generate Continuations and Fill in the Blank
TLDR
This work shows how FitB models can be easily tuned to allow for limited control over the length and word choice of the generation, and evaluates the feasibility of using a single model to do both tasks.
CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities
TLDR
It is argued that by curating and analyzing large interaction datasets, the HCI community can foster more incisive examinations of LMs’ generative capabilities, and presents CoAuthor, a dataset designed for revealing GPT-3’s capabilities in assisting creative and argumentative writing.
TaleBrush: Sketching Stories with Generative Pretrained Language Models
TLDR
TaleBrush is introduced, a generative story ideation tool that uses line sketching interactions with a GPT-based language model for control and sensemaking of a protagonist’s fortune in co-created stories, and a reflection on how sketching interactions can facilitate the iterative human-AI co-creation process.
AI as an Active Writer: Interaction Strategies with Generated Text in Human-AI Collaborative Fiction Writing
TLDR
A web-based human-AI collaborative writing tool that allows writers to shorten, edit, summarize, and regenerate text produced by AI, and finds that users took inspiration from unexpected text generated by the machine, and expected reduced fluency and coherence in the machine text when allowed to edit the output.
Evaluating Human-AI Hybrid Conversational Systems with Chatbot Message Suggestions
TLDR
How and when AI chatbot suggestions can help people answer questions in hybrid conversational systems is revealed and it seems that users would not simply ignore poor suggestions and compose responses as they could without seeing the suggestions.
Where to Hide a Stolen Elephant: Leaps in Creative Writing with Multimodal Machine Intelligence
TLDR
This work reports how participants perform integrative leaps, by which they do cognitive work to integrate suggestions of varying semantic relevance into their developing stories, and interprets these findings, offering modeling and design recommendations for future creative writing support technologies.
Will AI Console Me when I Lose my Pet? Understanding Perceptions of AI-Mediated Email Writing
Large language models are increasingly mediating, modifying, and even generating messages for users, but the receivers of these messages may not be aware of the involvement of AI. To examine this…
A decision model for designing NLP applications
TLDR
This position paper summarizes the progress of NLP applications that show parallel outputs from the NLP model at once to users, and presents a decision model that can assist in deciding whether a given condition is suitable for showing multiple outputs at once from the NLP model.

References

Showing 1-10 of 84 references
On Suggesting Phrases vs. Predicting Words for Mobile Text Composition
TLDR
A simple extension to the familiar mobile keyboard suggestion interface is introduced that presents phrase suggestions that can be accepted by a repeated-tap gesture and finds that phrases were interpreted as suggestions that affected the content of what participants wrote more than conventional single-word suggestions.
Counterfactual Language Model Adaptation for Suggesting Phrases
TLDR
It is found that even a simple language model can capture text characteristics that improve acceptability, and a counterfactual setting that permits offline training and evaluation is proposed.
User Interaction with Word Prediction: The Effects of Prediction Quality
TLDR
A study of two different word prediction methods compared against letter-by-letter entry at simulated AAC communication rates finds that word prediction systems can in fact speed communication rate and that a more accurate word prediction system can raise the communication rate higher than is explained by the additional accuracy of the system alone.
Creative Writing with a Machine in the Loop: Case Studies on Slogans and Stories
TLDR
Novel natural language models and design choices are suggested that may better support creative writing, as machine suggestions do not necessarily lead to better written artifacts.
Complementing text entry evaluations with a composition task
TLDR
The empirical results show that the composition task can serve as a valid complementary text entry evaluation method and provide a best-practice procedure for using composition tasks in text entry evaluations.
Effects of Language Modeling and its Personalization on Touchscreen Typing Performance
TLDR
It is shown for the first time at this scale that a combined spatial-language model reduces word error rate from a pre-model baseline of 38.4% down to 5.7%, and that LM personalization can improve this further to 4.6%.
Language Models are Unsupervised Multitask Learners
TLDR
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Language (Technology) is Power: A Critical Survey of “Bias” in NLP
TLDR
A greater recognition of the relationships between language and social hierarchies is urged, encouraging researchers and practitioners to articulate their conceptualizations of “bias” and to center work around the lived experiences of members of communities affected by NLP systems.
Pre-trained Models for Natural Language Processing: A Survey
TLDR
This survey is purposed to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
A Cost-Benefit Study of Text Entry Suggestion Interaction
TLDR
Results showed that although increasing the assertiveness of suggestions reduced the number of keyboard actions to enter text and was subjectively preferred, the costs of attending to and using the suggestions impaired average time performance.