TaPas: Weakly Supervised Table Parsing via Pre-training

@article{Herzig2020TaPasWS,
  title={TaPas: Weakly Supervised Table Parsing via Pre-training},
  author={Jonathan Herzig and Pawel Krzysztof Nowak and Thomas M{\"u}ller and Francesco Piccinno and Julian Martin Eisenschlos},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.02349}
}
Answering natural language questions over tables is usually seen as a semantic parsing task. To alleviate the collection cost of full logical forms, one popular approach focuses on weak supervision consisting of denotations instead of logical forms. However, training semantic parsers from weak supervision poses difficulties, and in addition, the generated logical forms are only used as an intermediate step prior to retrieving the denotation. In this paper, we present TaPas, an approach to… 
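The abstract describes answering questions over tables without generating logical forms. As a rough, non-authoritative sketch of that setting, the snippet below queries a small table with the TAPAS implementation in the HuggingFace Transformers library; the checkpoint name and toy table are assumptions made for illustration, not details taken from the paper.

import pandas as pd
import torch
from transformers import TapasTokenizer, TapasForQuestionAnswering

# Assumed checkpoint: a TAPAS model fine-tuned on WikiTableQuestions.
model_name = "google/tapas-base-finetuned-wtq"
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)

# Toy example table; TapasTokenizer expects every cell value as a string.
table = pd.DataFrame(
    {"City": ["Paris", "London", "Berlin"],
     "Population": ["2148000", "8982000", "3645000"]}
)
queries = ["Which city has the largest population?"]

inputs = tokenizer(table=table, queries=queries, padding="max_length", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Turn cell-selection and aggregation logits into table coordinates and an
# aggregation operator index (0=NONE, 1=SUM, 2=AVERAGE, 3=COUNT for this checkpoint).
coordinates, aggregation = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
answer_cells = [table.iat[row, col] for row, col in coordinates[0]]
print(answer_cells, aggregation[0])

The model answers by selecting table cells and, optionally, an aggregation operator over them, which is why no logical form appears anywhere in the pipeline.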
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
TLDR
GraPPa is an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data; used as the feature representation layer, it significantly outperforms RoBERTa-large and establishes new state-of-the-art results on the evaluated table semantic parsing tasks.
Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training
TLDR
A model pre-training framework, Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas by leveraging generation models to produce pre-training data, mitigating issues of existing general-purpose language models.
Understanding tables with intermediate pre-training
TLDR
This work adapts TAPAS (Herzig et al., 2020), a table-based BERT model, to recognize entailment, and builds a balanced dataset of millions of automatically created training examples that are learned in an intermediate step prior to fine-tuning.
Weakly Supervised Semantic Parsing by Learning from Mistakes
TLDR
Learning from Mistakes (LFM), a simple yet effective learning framework for weakly supervised semantic parsing (WSP), utilizes the mistakes made by a parser during search, i.e., generated logical forms that do not execute to the correct denotation, to tackle the two main challenges of WSP.
Translating Natural Language Questions to SQL Queries
TLDR
This work uses the pre-trained BERT-based TAPAS transformer model to encode more expressive table representations for the schema, in addition to the existing BiLSTM-based encodings, improving a sequence-to-sequence dual-task learning model so that it generalizes better on a zero-shot testbed.
DoT: An efficient Double Transformer for NLP tasks with tables
TLDR
This work proposes a new architecture, DoT, a double transformer model, that decomposes the problem into two sub-tasks: A shallow pruning transformer that selects the top-K tokens, followed by a deep task-specific transformer that takes as input those K tokens.
Generation-focused Table-based Intermediate Pre-training for Free-form Question Answering
TLDR
An intermediate pre-training framework, Generation-focused Table-based Intermediate Pre-training (GENTAP), that jointly learns representations of natural language questions and tables that enhance the question understanding and table representation abilities for complex questions is presented.
Semantic Parsing with Less Prior and More Monolingual Data
TLDR
This work investigates whether a generic transformer-based seq2seq model can achieve competitive performance with minimal semantic-parsing-specific inductive bias design, and finds positive evidence of a potentially easier path toward building accurate semantic parsers in the wild.
Structured Context and High-Coverage Grammar for Conversational Question Answering over Knowledge Graphs
TLDR
A new Logical Form (LF) grammar is introduced that can model a wide range of queries on the graph while remaining sufficiently simple to generate supervision data efficiently, and the proposed Transformer-based model takes a JSON-like structure as input, allowing it to easily incorporate both Knowledge Graph and conversational contexts.
Linking-Enhanced Pre-Training for Table Semantic Parsing
TLDR
Two novel pre-training objectives are designed to impose the desired inductive bias into the learned representations for table pre-training, and a schema-aware curriculum learning approach is proposed to mitigate the impact of noise and learn effectively from the pre-training data in an easy-to-hard manner.
...

References

Showing 1–10 of 49 references
Semantic Parsing on Freebase from Question-Answer Pairs
TLDR
This paper trains a semantic parser that scales up to Freebase and outperforms the state-of-the-art parser of Cai and Yates (2013) on their dataset, despite not having annotated logical forms.
Compositional Semantic Parsing on Semi-Structured Tables
TLDR
This paper proposes a logical-form-driven parsing algorithm guided by strong typing constraints and shows that it obtains significant improvements over natural baselines; the dataset is made publicly available.
Iterative Search for Weakly Supervised Semantic Parsing
TLDR
A novel iterative training algorithm is proposed that alternates between searching for consistent logical forms and maximizing the marginal likelihood of the retrieved ones, thus dealing with the problem of spuriousness.
Learning Semantic Parsers from Denotations with Latent Structured Alignments and Abstract Programs
TLDR
This work capitalizes on the intuition that correct programs would likely respect certain structural constraints were they aligned to the question, and proposes to model alignments as structured latent variables as part of the latent-alignment framework.
Learning a Natural Language Interface with Neural Programmer
TLDR
This paper presents the first weakly supervised, end-to-end neural network model to induce such programs on a real-world dataset; it enhances the objective function of Neural Programmer, a neural network with built-in discrete operations, and applies it to WikiTableQuestions, a natural language question-answering dataset.
Neural Semantic Parsing over Multiple Knowledge-bases
TLDR
This paper finds that it can substantially improve parsing accuracy by training a single sequence-to-sequence model over multiple KBs, when providing an encoding of the domain at decoding time.
Neural enquirer: learning to query tables in natural language
TLDR
The experiments show that Neural Enquirer can learn to execute fairly complicated NL queries on tables with rich structures, and is one step towards building neural network systems that seek to understand language by executing it on real-world data.
Driving Semantic Parsing from the World’s Response
TLDR
This paper develops two novel learning algorithms that can predict complex structures while relying only on a binary feedback signal from the context of an external world, and reformulates the semantic parsing problem to reduce the model's dependence on syntactic patterns, allowing the parser to scale better using less supervision.
Knowledge-Aware Conversational Semantic Parsing Over Web Tables
TLDR
A knowledge-aware semantic parser to improve parsing performance by integrating various types of knowledge, including grammar knowledge, expert knowledge, and external resource knowledge, which is based on a decomposable model.
Cross-domain Semantic Parsing via Paraphrasing
TLDR
By converting logical forms into canonical utterances in natural language, semantic parsing is reduced to paraphrasing, and an attentive sequence-to-sequence paraphrase model is developed that is general and flexible to adapt to different domains.
...