TaPas: Weakly Supervised Table Parsing via Pre-training
@article{Herzig2020TaPasWS,
  title={TaPas: Weakly Supervised Table Parsing via Pre-training},
  author={Jonathan Herzig and Pawel Krzysztof Nowak and Thomas M{\"u}ller and Francesco Piccinno and Julian Martin Eisenschlos},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.02349}
}
Answering natural language questions over tables is usually seen as a semantic parsing task. To alleviate the collection cost of full logical forms, one popular approach focuses on weak supervision consisting of denotations instead of logical forms. However, training semantic parsers from weak supervision poses difficulties, and in addition, the generated logical forms are only used as an intermediate step prior to retrieving the denotation. In this paper, we present TaPas, an approach to…
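As a concrete illustration of what a trained TAPAS model does at inference time, the sketch below runs table question answering with the HuggingFace `transformers` implementation. The checkpoint name, toy table, and query are assumptions made for illustration, not artifacts of the paper.

```python
# Minimal sketch of table QA with TAPAS via HuggingFace `transformers`
# (not part of the paper itself); the checkpoint, toy table, and query
# are illustrative assumptions.
import pandas as pd
from transformers import TapasTokenizer, TapasForQuestionAnswering

model_name = "google/tapas-base-finetuned-wtq"  # WikiTableQuestions fine-tune
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)

# TAPAS expects the table as a pandas DataFrame of strings.
table = pd.DataFrame(
    {"City": ["Paris", "Berlin"], "Population": ["2148000", "3645000"]}
)
queries = ["Which city has the larger population?"]

inputs = tokenizer(table=table, queries=queries, padding="max_length",
                   return_tensors="pt")
outputs = model(**inputs)

# Decode the selected cells and the predicted aggregation operator.
coords, agg = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
answer_cells = [table.iat[row, col] for row, col in coords[0]]
print(answer_cells, agg)
```

Note that TAPAS answers by selecting table cells and, optionally, an aggregation operator over them, rather than by generating an explicit logical form.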
170 Citations
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
- Computer Science · ICLR
- 2021
GraPPa is an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data; used as the feature representation layer, it significantly outperforms RoBERTa-large and establishes new state-of-the-art results on popular table semantic parsing benchmarks.
Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training
- Computer Science · AAAI
- 2021
A model pre-training framework, Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas by leveraging generation models to produce pre-training data, mitigating issues with existing general-purpose language models.
Understanding tables with intermediate pre-training
- Computer Science · Findings of EMNLP
- 2020
This work adapts TAPAS (Herzig et al., 2020), a table-based BERT model, to recognize entailment, and builds a balanced dataset of millions of automatically generated training examples that are learned in an intermediate step prior to fine-tuning.
Weakly Supervised Semantic Parsing by Learning from Mistakes
- Computer Science · EMNLP
- 2021
Learning from Mistakes (LFM), a simple yet effective learning framework for weakly supervised semantic parsing (WSP), utilizes the mistakes a parser makes during search, i.e., logical forms that do not execute to the correct denotation, to tackle the challenges of weak supervision.
Translating Natural Language Questions to SQL Queries
- Computer Science
- 2021
This work uses the pre-trained BERT-based TAPAS transformer model to encode more expressive table representations for the schema, in addition to existing BiLSTM-based encodings, improving a sequence-to-sequence dual-task learning model so that it generalizes better on a zero-shot testbed.
DoT: An efficient Double Transformer for NLP tasks with tables
- Computer Science · Findings of ACL
- 2021
This work proposes a new architecture, DoT, a double transformer model that decomposes the problem into two sub-tasks: a shallow pruning transformer that selects the top-K tokens, followed by a deep task-specific transformer that takes those K tokens as input.
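The decomposition is easy to picture in code. The sketch below is a minimal, illustrative rendering of the idea in PyTorch; all module sizes, the scoring head, and the value of K are assumptions, not the paper's configuration.

```python
# Minimal sketch of a DoT-style decomposition: a shallow transformer scores
# tokens, and only the top-K hidden states reach the deep task transformer.
# All dimensions and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class DoubleTransformer(nn.Module):
    def __init__(self, vocab=30522, d=256, k=128):
        super().__init__()
        self.k = k
        self.embed = nn.Embedding(vocab, d)
        self.pruner = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, nhead=4, batch_first=True),
            num_layers=2)                      # shallow pruning transformer
        self.score = nn.Linear(d, 1)           # per-token keep score
        self.task = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, nhead=4, batch_first=True),
            num_layers=12)                     # deep task-specific transformer

    def forward(self, token_ids):
        h = self.pruner(self.embed(token_ids))          # (B, L, d)
        scores = self.score(h).squeeze(-1)              # (B, L)
        topk = scores.topk(self.k, dim=1).indices       # keep K tokens
        kept = torch.gather(
            h, 1, topk.unsqueeze(-1).expand(-1, -1, h.size(-1)))
        return self.task(kept)                          # (B, K, d)

model = DoubleTransformer()
out = model(torch.randint(0, 30522, (2, 512)))  # two sequences of 512 tokens
print(out.shape)  # torch.Size([2, 128, 256])
```

The payoff of this design is that the expensive deep encoder runs over K tokens instead of the full sequence length L.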
Generation-focused Table-based Intermediate Pre-training for Free-form Question Answering
- Computer Science
- 2022
An intermediate pre-training framework, Generation-focused Table-based Intermediate Pre-training (GENTAP), is presented that jointly learns representations of natural language questions and tables, enhancing question understanding and table representation for complex questions.
Semantic Parsing with Less Prior and More Monolingual Data
- Computer Science · ArXiv
- 2021
This work investigates whether a generic transformer-based seq2seq model can achieve competitive performance with minimal semantic-parsing-specific inductive bias in its design, finding positive evidence of a potentially easier path toward building accurate semantic parsers in the wild.
Structured Context and High-Coverage Grammar for Conversational Question Answering over Knowledge Graphs
- Computer Science · EMNLP
- 2021
A new Logical Form (LF) grammar is introduced that can model a wide range of queries on the graph while remaining sufficiently simple to generate supervision data efficiently, and this Transformer-based model takes a JSON-like structure as input, allowing it to easily incorporate both Knowledge Graph and conversational contexts.
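For intuition, a JSON-like input of the kind this summary describes might bundle the dialogue history with KG facts before tokenization. The schema below is purely an illustrative assumption, not the paper's exact format.

```python
# Illustrative sketch only: one way to serialize KG context plus dialogue
# history as a JSON-like structure for a Transformer input; this schema is
# an assumption, not the paper's exact format.
import json

example = {
    "conversation": [
        {"speaker": "user", "utterance": "Who directed Inception?"},
        {"speaker": "system", "utterance": "Christopher Nolan"},
        {"speaker": "user", "utterance": "What else did he direct?"},
    ],
    "kg_context": [
        {"subject": "Inception", "relation": "directed_by",
         "object": "Christopher Nolan"},
    ],
}
# The serialized string can then be tokenized like ordinary text.
model_input = json.dumps(example, separators=(",", ":"))
print(model_input)
```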
Linking-Enhanced Pre-Training for Table Semantic Parsing
- Computer Science
- 2021
Two novel pre-training objectives are designed to impose the desired inductive bias into the learned representations for table pre-training, and a schema-aware curriculum learning approach is proposed to mitigate the impact of noise and learn effectively from the pre-training data in an easy-to-hard manner.
References
Showing 1-10 of 49 references
Semantic Parsing on Freebase from Question-Answer Pairs
- Computer Science · EMNLP
- 2013
This paper trains a semantic parser that scales up to Freebase and, despite not having annotated logical forms, outperforms the state-of-the-art parser of Cai and Yates (2013) on their dataset.
Compositional Semantic Parsing on Semi-Structured Tables
- Computer Science · ACL
- 2015
This paper proposes a logical-form-driven parsing algorithm guided by strong typing constraints, shows that it obtains significant improvements over natural baselines, and makes its dataset publicly available.
Iterative Search for Weakly Supervised Semantic Parsing
- Computer Science · NAACL
- 2019
A novel iterative training algorithm is proposed that alternates between searching for consistent logical forms and maximizing the marginal likelihood of the retrieved ones, thus dealing with the problem of spuriousness.
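The alternation is simple to sketch. In the toy below, the tiny program space, executor, and softmax scorer are all illustrative assumptions; it shows the two phases the summary names: searching for logical forms consistent with the denotation, and the marginal log-likelihood over that consistent set, which the training step would then maximize.

```python
# Toy sketch of the alternation in iterative weakly supervised parsing:
# (1) search for logical forms consistent with the denotation, (2) train on
# the marginal likelihood of the consistent set. The tiny program space,
# executor, and scorer are all illustrative assumptions.
import itertools
import math
import random

COLUMNS = ["population", "area"]
OPS = ["max", "min", "sum"]
TABLE = {"population": [5, 9, 2], "area": [7, 1, 4]}

def execute(program):
    op, col = program
    fn = {"max": max, "min": min, "sum": sum}[op]
    return fn(TABLE[col])

def search_consistent(denotation):
    # Exhaustive search of the toy program space for consistent programs.
    return [p for p in itertools.product(OPS, COLUMNS)
            if execute(p) == denotation]

def marginal_log_likelihood(weights, programs):
    # log sum over consistent programs of p(z|x) under a softmax scorer.
    all_progs = list(itertools.product(OPS, COLUMNS))
    log_z = math.log(sum(math.exp(weights[p]) for p in all_progs))
    return math.log(sum(math.exp(weights[p]) for p in programs)) - log_z

weights = {p: random.random() for p in itertools.product(OPS, COLUMNS)}
consistent = search_consistent(denotation=9)   # e.g. the gold answer is 9
print(consistent, marginal_log_likelihood(weights, consistent))
```

Spuriousness shows up when several programs execute to the same denotation; the iterative scheme keeps re-searching under the improving model so the likelihood mass shifts toward plausible programs.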
Learning Semantic Parsers from Denotations with Latent Structured Alignments and Abstract Programs
- Computer Science · EMNLP
- 2019
This work capitalizes on the intuition that correct programs are likely to respect certain structural constraints when aligned to the question, and proposes to model alignments as structured latent variables within a latent-alignment framework.
Learning a Natural Language Interface with Neural Programmer
- Computer Science · ICLR
- 2017
This paper presents the first weakly supervised, end-to-end neural network model to induce programs over tables on a real-world dataset; it enhances the objective function of Neural Programmer, a neural network with built-in discrete operations, and applies it to WikiTableQuestions, a natural language question-answering dataset.
Neural Semantic Parsing over Multiple Knowledge-bases
- Computer Science · ACL
- 2017
This paper finds that it can substantially improve parsing accuracy by training a single sequence-to-sequence model over multiple KBs, when providing an encoding of the domain at decoding time.
Neural enquirer: learning to query tables in natural language
- Computer Science · IEEE Data Eng. Bull.
- 2016
The experiments show that Neural Enquirer can learn to execute fairly complicated NL queries on tables with rich structures, and is one step towards building neural network systems that seek to understand language by executing it on real-world data.
Driving Semantic Parsing from the World’s Response
- Computer Science · CoNLL
- 2010
This paper develops two novel learning algorithms that predict complex structures while relying only on a binary feedback signal grounded in the context of an external world, and reformulates the semantic parsing problem to reduce the model's dependency on syntactic patterns, allowing the parser to scale better with less supervision.
Knowledge-Aware Conversational Semantic Parsing Over Web Tables
- Computer Science · NLPCC
- 2019
A knowledge-aware semantic parser, built on a decomposable model, that improves parsing performance by integrating various types of knowledge, including grammar knowledge, expert knowledge, and external resource knowledge.
Cross-domain Semantic Parsing via Paraphrasing
- Computer Science · EMNLP
- 2017
By converting logical forms into canonical natural language utterances, semantic parsing is reduced to paraphrasing, and an attentive sequence-to-sequence paraphrase model is developed that is general and flexible enough to adapt to different domains.