Grounded Graph Decoding Improves Compositional Generalization in Question Answering

@article{Gai2021GroundedGD,
  title={Grounded Graph Decoding Improves Compositional Generalization in Question Answering},
  author={Yu Gai and Paras Jain and Wendi Zhang and Joseph Gonzalez and Dawn Xiaodong Song and Ion Stoica},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.03642}
}
Question answering models struggle to generalize to novel compositions of training patterns, such as longer sequences or more complex test structures. Current end-to-end models learn a flat input embedding, which can lose input syntax context. Prior approaches improve generalization by learning permutation-invariant models, but these methods do not scale to more complex train-test splits. We propose Grounded Graph Decoding, a method to improve compositional generalization of language…

Compositional Generalization in Multilingual Semantic Parsing over Wikidata

A method is proposed for creating a multilingual, parallel dataset of question-query pairs, grounded in Wikidata, and it is used to analyze the compositional generalization of semantic parsers in Hebrew, Kannada, Chinese, and English.

Compositional Semantic Parsing with Large Language Models

The best method is based on least-to-most prompting: it decomposes the problem using prompting-based syntactic parsing, then uses this decomposition to select appropriate exemplars and to sequentially generate the semantic parse.
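
As a rough sketch of that least-to-most prompting workflow (the call_llm function, prompt templates, exemplar pool, and selection heuristic below are hypothetical stand-ins, not the cited paper's actual implementation):

    # Minimal sketch of least-to-most prompting for semantic parsing.
    # `call_llm`, the prompt strings, and `exemplar_pool` are hypothetical
    # placeholders; the cited paper's prompts and models differ in detail.

    def call_llm(prompt: str) -> str:
        """Placeholder for a call to a large language model."""
        raise NotImplementedError

    def least_to_most_parse(question: str, exemplar_pool: list[tuple[str, str]]) -> str:
        # 1) Decompose the question into simpler sub-problems via prompting.
        decomposition_prompt = (
            "Decompose the question into a sequence of simpler sub-questions.\n"
            f"Question: {question}\nSub-questions:"
        )
        subquestions = call_llm(decomposition_prompt).splitlines()

        # 2) Use the decomposition to select relevant exemplars
        #    (here: naive token-overlap scoring).
        def overlap(a: str, b: str) -> int:
            return len(set(a.lower().split()) & set(b.lower().split()))

        exemplars = sorted(
            exemplar_pool,
            key=lambda ex: max((overlap(ex[0], sq) for sq in subquestions), default=0),
            reverse=True,
        )[:4]

        # 3) Sequentially generate the parse, solving sub-questions in order
        #    and feeding earlier results back into the prompt.
        context = "\n".join(f"Q: {q}\nParse: {p}" for q, p in exemplars)
        partial_results: list[str] = []
        for sq in subquestions:
            prompt = f"{context}\n" + "\n".join(partial_results) + f"\nQ: {sq}\nParse:"
            partial_results.append(f"Q: {sq}\nParse: {call_llm(prompt)}")

        # The parse generated for the final (full) sub-question is returned.
        return partial_results[-1].split("Parse:", 1)[1].strip()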

State-of-the-art generalisation research in NLP: a taxonomy and review

A taxonomy for characterising and understanding generalisation research in NLP is presented, the taxonomy is used to present a comprehensive map of published generalisation studies, and recommendations are made for which areas might deserve attention in the future.

References

Showing 1-10 of 30 references

Hierarchical Poset Decoding for Compositional Generalization in Language

A novel hierarchical poset decoding paradigm for compositional generalization in language is proposed; it enforces partial permutation invariance in semantics, thus avoiding overfitting to biased ordering information, and results show that it outperforms current decoders.

Compositional Generalization in Semantic Parsing: Pre-training vs. Specialized Architectures

It is shown that masked language model (MLM) pre-training rivals SCAN-inspired architectures on primitive holdout splits, and a new state of the art is established on the CFQ compositional generalization benchmark using MLM pre-training together with an intermediate representation.

Unlocking Compositional Generalization in Pre-trained Models Using Intermediate Representations

It is highlighted that intermediate representations provide an important and potentially overlooked degree of freedom for improving the compositional generalization abilities of pre-trained seq2seq models.

Measuring Compositional Generalization: A Comprehensive Method on Realistic Data

A novel method is introduced to systematically construct compositional generalization benchmarks by maximizing compound divergence while guaranteeing a small atom divergence between train and test sets, and it is demonstrated how this method can be used to create new compositionality benchmarks on top of the existing SCAN dataset.
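
For concreteness, the divergences used in that benchmark-construction method can be sketched as follows (a minimal sketch assuming normalized atom and compound frequency distributions; the exponents 0.5 and 0.1 follow the cited paper's choices for atom and compound divergence):

    # Minimal sketch of atom/compound divergence for benchmark construction.
    # Inputs are frequency distributions (dicts mapping an atom or compound to
    # its relative frequency) for the train and test sets; distributions are
    # assumed to be normalized.

    def chernoff_coefficient(p: dict, q: dict, alpha: float) -> float:
        """C_alpha(P || Q) = sum_k p_k^alpha * q_k^(1 - alpha)."""
        keys = set(p) | set(q)
        return sum(p.get(k, 0.0) ** alpha * q.get(k, 0.0) ** (1.0 - alpha) for k in keys)

    def atom_divergence(train_atoms: dict, test_atoms: dict) -> float:
        # alpha = 0.5 weighs train and test atoms symmetrically.
        return 1.0 - chernoff_coefficient(train_atoms, test_atoms, alpha=0.5)

    def compound_divergence(train_compounds: dict, test_compounds: dict) -> float:
        # alpha = 0.1 mostly asks whether test compounds were seen in training.
        return 1.0 - chernoff_coefficient(train_compounds, test_compounds, alpha=0.1)

A maximum-compound-divergence split then searches for a train/test partition that maximizes compound divergence while keeping atom divergence below a small threshold.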

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.

Measuring Compositionality in Representation Learning

This work describes a procedure for evaluating compositionality by measuring how well the true representation-producing model can be approximated by a model that explicitly composes a collection of inferred representational primitives.
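
A heavily simplified sketch of that evaluation idea follows; it assumes additive composition, cosine distance, and one learned vector per primitive, whereas the cited work treats the composition operator and distance as configurable choices:

    # Simplified sketch of a tree-reconstruction-style compositionality
    # measure: learn one embedding per primitive, compose them (here by
    # summation), and measure how closely the composed vectors approximate
    # the observed representations. All modeling choices are illustrative.
    import torch

    def compositionality_error(observed: dict, dim: int, steps: int = 2000) -> float:
        """observed maps a tuple of primitive names -> observed representation (Tensor of size dim)."""
        primitives = sorted({p for parts in observed for p in parts})
        index = {p: i for i, p in enumerate(primitives)}
        table = torch.nn.Embedding(len(primitives), dim)
        opt = torch.optim.Adam(table.parameters(), lr=1e-2)

        def composed(parts):
            idx = torch.tensor([index[p] for p in parts])
            return table(idx).sum(dim=0)  # additive composition (illustrative choice)

        for _ in range(steps):
            opt.zero_grad()
            loss = sum(
                1 - torch.cosine_similarity(composed(parts), rep, dim=0)
                for parts, rep in observed.items()
            )
            loss.backward()
            opt.step()

        # Lower residual error => representations are better explained by
        # explicit composition of the inferred primitives.
        with torch.no_grad():
            return float(sum(
                1 - torch.cosine_similarity(composed(parts), rep, dim=0)
                for parts, rep in observed.items()
            ))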

Compositional Questions Do Not Necessitate Multi-hop Reasoning

This work introduces a single-hop BERT-based RC model that achieves 67 F1, comparable to state-of-the-art multi-hop models, and designs an evaluation setting where humans are not shown all of the paragraphs necessary for the intended multi-hop reasoning but can still answer over 80% of the questions.

AMR Parsing as Graph Prediction with Latent Alignment

A neural parser is introduced that treats alignments as latent variables within a joint probabilistic model of concepts, relations, and alignments, and it is shown that joint modeling is preferable to an align-then-parse pipeline.

Permutation Equivariant Models for Compositional Generalization in Language

This paper hypothesizes that language compositionality is a form of group-equivariance and proposes a set of tools for constructing equivariant sequence-to-sequence models that are able to achieve the type of compositional generalization required in human language understanding.
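
In rough terms (the standard group-equivariance condition, not necessarily the paper's exact formalization), a sequence-to-sequence map f is equivariant with respect to a group G of permutations acting jointly on input and output vocabularies when

    f(g · x) = g · f(x)   for every g in G,

so that, for example, swapping "jump" and "walk" in a SCAN command swaps the corresponding actions in the predicted output sequence.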

Compositional Generalization via Neural-Symbolic Stack Machines

The Neural-Symbolic Stack Machine (NeSS), a neural network to generate traces, which are then executed by a symbolic stack machine enhanced with sequence manipulation operations, achieves 100% generalization performance in four domains.
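
As a loose illustration of the symbolic-execution side of such a design (the operation set and trace format below are invented for illustration and do not reproduce the cited machine):

    # Loose illustration of executing a predicted trace on a tiny symbolic
    # stack machine with sequence-manipulation operations. The operation set
    # and trace format are invented for illustration only.

    def execute(trace: list[tuple], tokens: list[str]) -> list[str]:
        stack: list[list[str]] = []
        for op, *args in trace:
            if op == "PUSH":        # push a single input token as a sequence
                stack.append([tokens[args[0]]])
            elif op == "CONCAT":    # concatenate the top two sequences
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif op == "REPEAT":    # repeat the top sequence n times
                stack.append(stack.pop() * args[0])
            elif op == "MAP":       # rewrite each token via a lookup table
                stack.append([args[0].get(t, t) for t in stack.pop()])
            else:
                raise ValueError(f"unknown operation: {op}")
        return stack.pop()

    # Example: a trace that a neural controller might predict for the SCAN
    # command "jump twice".
    trace = [("PUSH", 0), ("MAP", {"jump": "JUMP"}), ("REPEAT", 2)]
    print(execute(trace, ["jump", "twice"]))  # ['JUMP', 'JUMP']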