Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children's mindreading ability

@inproceedings{Kovatchev2021CanVR,
  title={Can vectors read minds better than experts? Comparing data augmentation strategies for the automated scoring of children's mindreading ability},
  author={Venelin Kovatchev and Phillip Smith and Mark G. Lee and Rory T. Devine},
  booktitle={ACL/IJCNLP},
  year={2021}
}
In this paper, we implement and compare seven data augmentation strategies for the task of automatically scoring children's ability to understand others' thoughts, feelings, and desires (or "mindreading"). We recruit in-domain experts to re-annotate augmented samples and determine the extent to which each strategy preserves the original rating. We also carry out multiple experiments to measure how much each augmentation strategy improves the performance of automatic scoring systems. To determine…
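To illustrate the kind of augmentation strategies compared in work like this, here is a minimal sketch of two common EDA-style text operations, random swap and random deletion. The function names, example sentence, and parameter values are illustrative assumptions, not the paper's actual implementation.

```python
import random

def random_swap(tokens, n_swaps=1, rng=None):
    """Return a copy of tokens with n_swaps random position swaps."""
    rng = rng or random.Random(0)
    tokens = list(tokens)
    for _ in range(n_swaps):
        i = rng.randrange(len(tokens))
        j = rng.randrange(len(tokens))
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p=0.1, rng=None):
    """Drop each token independently with probability p, keeping at least one."""
    rng = rng or random.Random(0)
    kept = [t for t in tokens if rng.random() > p]
    return kept or [rng.choice(tokens)]

# Hypothetical child response being augmented
sentence = "the boy thinks the toy is in the box".split()
print(random_swap(sentence, n_swaps=2))
print(random_deletion(sentence, p=0.2))
```

A key point the paper's expert re-annotation addresses: operations like these can silently change the original rating (e.g., deleting "thinks" removes the mental-state verb), so label preservation must be checked rather than assumed.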

