FeTaQA: Free-form Table Question Answering

@article{Nan2022FeTaQAFT,
  title={FeTaQA: Free-form Table Question Answering},
  author={Linyong Nan and Chia-Hsuan Hsieh and Ziming Mao and Xi Victoria Lin and Neha Verma and Rui Zhang and Wojciech Kryscinski and Nick Schoelkopf and Riley Kong and Xiangru Tang and Murori Mutuma and Benjamin Rosand and Isabel Trindade and Renusree Bandaru and Jacob Cunningham and Caiming Xiong and Dragomir Radev},
  journal={Transactions of the Association for Computational Linguistics},
  year={2022},
  volume={10},
  pages={35--49}
}
Existing table question answering datasets contain abundant factual questions that primarily evaluate a QA system’s comprehension of query and tabular data. However, restricted by their short-form answers, these datasets fail to include question–answer interactions that represent more advanced and naturally occurring information needs: questions that ask for reasoning and integration of information pieces retrieved from a structured knowledge source. To complement the existing datasets and to… 
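
To make the contrast with short-form answers concrete, the sketch below shows what a single free-form table QA instance looks like. The dictionary fields and the table content are illustrative assumptions, not FeTaQA's exact schema:

# Illustrative free-form table QA instance (field names and content are
# assumptions for illustration, not FeTaQA's exact schema).
example = {
    "table": {
        "header": ["Year", "Title", "Role"],
        "rows": [
            ["2008", "Film A", "Lead"],
            ["2010", "Film B", "Supporting"],
        ],
    },
    "question": "How did the actor's roles change between 2008 and 2010?",
    # A short-form dataset would answer with a single cell; free-form QA
    # expects a fluent sentence that integrates several cells.
    "answer": "After playing the lead in Film A (2008), the actor took a "
              "supporting role in Film B (2010).",
}

print(example["answer"])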

Generation-Focused Table-Based Intermediate Pre-training for Free-Form Question Answering

TLDR
An intermediate pre-training framework, Generation-focused Table-based Intermediate Pre-training (GENTAP), is presented that jointly learns representations of natural language questions and tables, enhancing question understanding and table representation for complex questions.

A Survey on Table Question Answering: Recent Advances

TLDR
An overview of available datasets and representative methods in table QA is provided, covering semantic-parsing-based, generative, extractive, matching-based, and retriever-reader-based methods.
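
As a concrete illustration of the first family in this taxonomy, the sketch below runs a semantic-parsing-based pipeline end to end: the question is mapped to an executable SQL program and executed over the table. The parser is a hard-coded stub standing in for a learned model:

import sqlite3

# Minimal sketch of the semantic-parsing-based family:
# question -> executable program -> answer.
def parse_to_sql(question: str) -> str:
    # Stub: pretend a trained parser produced this query for the question.
    return "SELECT title FROM films WHERE year = 2010"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (year INTEGER, title TEXT)")
conn.executemany("INSERT INTO films VALUES (?, ?)",
                 [(2008, "Film A"), (2010, "Film B")])

sql = parse_to_sql("Which film was released in 2010?")
print(conn.execute(sql).fetchall())  # [('Film B',)]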

AIT-QA: Question Answering Dataset over Complex Tables in the Airline Industry

TLDR
This work introduces AIT-QA, a domain-specific Table QA test dataset focusing on the airline industry, and provides data-driven insights on how different aspects of this setting affect Table QA methods, in order to help the community develop improved methods for domain-specific Table QA.

Parameter-Efficient Abstractive Question Answering over Tables or Text

TLDR
This work studies parameter-efficient abstractive QA in encoder-decoder models over structured tabular data and unstructured textual data, using only 1.5% additional parameters for each modality, and achieves comparable performance on a textual QA dataset such as NarrativeQA with significantly fewer trainable parameters than fine-tuning.
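
The parameter budget quoted above is in the spirit of bottleneck adapters: small trainable modules inserted into an otherwise frozen model. The sketch below is a generic adapter in PyTorch with illustrative sizes, not the paper's exact architecture:

import torch
import torch.nn as nn

# Generic bottleneck adapter: a down-project/up-project pair inserted into
# a frozen encoder-decoder layer, so only the adapter weights are trained.
# Dimensions are illustrative, not the paper's configuration.
class Adapter(nn.Module):
    def __init__(self, d_model: int = 768, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.ReLU()

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen model's output as a baseline.
        return hidden + self.up(self.act(self.down(hidden)))

adapter = Adapter()
out = adapter(torch.randn(2, 10, 768))  # (batch, seq, d_model) passes through
trainable = sum(p.numel() for p in adapter.parameters())
print(f"trainable adapter parameters per layer: {trainable}")  # ~50k,
# versus roughly 7M weights in a full transformer layer at d_model=768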

ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

TLDR
This work presents two transformer-based models that combine visual features and the data table of the chart in a unified way to answer questions, achieving state-of-the-art results on previous datasets as well as on the new benchmark.

Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning Skills

TLDR
This work proposes to leverage semi-structured tables to automatically generate question-paragraph pairs at scale, where answering the question requires reasoning over multiple facts in the paragraph, and shows that the resulting model, PReasM, substantially outperforms T5, a popular pre-trained encoder-decoder model.
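
A minimal sketch of table-driven example generation, assuming simple string templates over illustrative table content (the paper's templates and reasoning skills are far richer):

# Template-based generation of QA examples from a table. The table and
# templates are illustrative stand-ins, not the paper's actual data.
table = {
    "header": ["Country", "Capital", "Population (M)"],
    "rows": [["France", "Paris", 67], ["Japan", "Tokyo", 125]],
}

def generate_examples(table):
    examples = []
    for row in table["rows"]:
        record = dict(zip(table["header"], row))
        examples.append({
            "question": f"What is the capital of {record['Country']}?",
            "answer": record["Capital"],
        })
        # Comparison or arithmetic templates over numeric columns would be
        # added similarly to target multi-fact reasoning.
    return examples

for ex in generate_examples(table):
    print(ex)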

MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data

TLDR
A new large-scale benchmark, MultiHiertt, with QA pairs over multi-hierarchical tabular and textual data is constructed, and a novel QA model termed MT2Net is introduced, which first applies fact retrieval to extract relevant supporting facts from both tables and text and then uses a reasoning module to perform symbolic reasoning over the retrieved facts.
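
The two-stage design described above can be sketched as retrieve-then-reason. Both stages below are deliberately simplified stand-ins: a lexical-overlap scorer in place of a trained retriever, and a single arithmetic step in place of a learned reasoning module:

# Retrieve-then-reason sketch: stage 1 scores candidate facts against the
# question, stage 2 runs a symbolic step over the retrieved numbers.
def retrieve(question, facts, k=2):
    # Stand-in lexical-overlap scorer; real systems use a trained retriever.
    q_tokens = set(question.lower().split())
    scored = sorted(facts, key=lambda f: -len(q_tokens & set(f.lower().split())))
    return scored[:k]

def reason(values):
    # Stand-in symbolic step: difference between two retrieved figures.
    return values[0] - values[1]

facts = ["revenue in 2021 was 120", "revenue in 2020 was 95", "employees: 300"]
question = "How much did revenue change from 2020 to 2021?"
top = retrieve(question, facts)
numbers = [int(f.split()[-1]) for f in top]
print(top, "->", reason(sorted(numbers, reverse=True)))  # 120 - 95 = 25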

DUE: End-to-End Document Understanding Benchmark

TLDR
The Document Understanding Evaluation (DUE) benchmark, consisting of both available and reformulated datasets, is introduced to measure the end-to-end capabilities of systems in real-world scenarios and to empower research in the NLP community.

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

TLDR
The UNIFIEDSKG framework is proposed, which unifies 21 SKG tasks into a text-to-text format, aiming to promote systematic SKG research, instead of being exclusive to a single task, domain, or dataset.
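
The unification rests on linearizing structured inputs into plain text. The sketch below shows one plausible serialization of a table QA instance for a text-to-text model; the exact separator tokens are an assumption, not necessarily UNIFIEDSKG's format:

# Casting a table QA instance into text-to-text form. Separators and field
# markers are illustrative assumptions.
def linearize(question, table):
    header = " | ".join(table["header"])
    rows = " ".join("row: " + " | ".join(map(str, r)) for r in table["rows"])
    return f"question: {question} ; table: col: {header} {rows}"

table = {"header": ["Player", "Goals"], "rows": [["Ann", 3], ["Ben", 5]]}
print(linearize("Who scored more goals?", table))
# The resulting string becomes the encoder input of a seq2seq model such as
# T5, and the answer string is the decoding target.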

Data-QuestEval: A Referenceless Metric for Data-to-Text Semantic Evaluation

TLDR
This work proposes a method to build synthetic multimodal corpora that enable training multimodal components for a data-QuestEval metric, which obtains state-of-the-art correlations with human judgment on the WebNLG and WikiBio benchmarks.
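
The referenceless idea can be sketched as a round trip: generate questions from the input data, answer them from the generated text, and average the agreement. Both the question generator and the QA component below are toy stand-ins for the trained models the metric actually uses:

# QA-based referenceless evaluation sketch. Question generation and QA are
# stubbed with simple rules for illustration.
def answer_from_text(question, text):
    # Stand-in extractive QA: return the token after the attribute mention.
    attr = question.removeprefix("What is the ").rstrip("?")
    words = text.replace(",", "").split()
    for i, w in enumerate(words):
        if w == attr and i + 1 < len(words):
            return words[i + 1]
    return ""

def evaluate(data_records, generated_text):
    scores = []
    for attr, value in data_records:
        question = f"What is the {attr}?"  # stub question generation
        predicted = answer_from_text(question, generated_text)
        scores.append(1.0 if predicted == str(value) else 0.0)
    return sum(scores) / len(scores)

records = [("capital", "Paris"), ("population", "67")]
text = "France has capital Paris and population 67 million."
print(evaluate(records, text))  # 1.0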

References

Showing 1–10 of 49 references

Open Question Answering over Tables and Text

TLDR
This work considers, for the first time, open QA over both tabular and textual data and presents a new large-scale dataset, Open Table-and-Text Question Answering (OTT-QA), to evaluate performance on this task.

HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering

TLDR
It is shown that HotpotQA is challenging for the latest QA systems, and the supporting facts enable models to improve performance and make explainable predictions.

MultiModalQA: Complex Question Answering over Text, Tables and Images

TLDR
This paper creates MMQA, a challenging question answering dataset that requires joint reasoning over text, tables, and images, and defines a formal language for taking questions that can be answered from a single modality and combining them to generate cross-modal questions.
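
Cross-modal composition can be sketched as slot filling: the answer to a single-modality question is substituted into another question's placeholder. The record format and the [ANSWER] placeholder are illustrative assumptions:

# Composing a cross-modal question from two single-modality questions.
# The field names and placeholder syntax are illustrative.
table_q = {"modality": "table",
           "question": "Which film won Best Picture in 2010?",
           "answer": "The Hurt Locker"}
image_q = {"modality": "image",
           "question": "What color is the poster of [ANSWER]?"}

composed = image_q["question"].replace("[ANSWER]", table_q["answer"])
print(composed)  # "What color is the poster of The Hurt Locker?"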

ELI5: Long Form Question Answering

TLDR
This work introduces the first large-scale corpus for long form question answering, a task requiring elaborate and in-depth answers to open-ended questions, and shows that an abstractive model trained with a multi-task objective outperforms conventional Seq2Seq, language modeling, as well as a strong extractive baseline.

Natural Questions: A Benchmark for Question Answering Research

TLDR
The Natural Questions corpus, a question answering data set, is presented, introducing robust metrics for the purposes of evaluating question answering systems; demonstrating high human upper bounds on these metrics; and establishing baseline results using competitive methods drawn from related literature.

Hurdles to Progress in Long-form Question Answering

TLDR
The task formulation raises fundamental challenges regarding evaluation and dataset creation that currently preclude meaningful modeling progress; a new system relying on sparse attention and contrastive retriever learning is designed and achieves state-of-the-art performance on the ELI5 LFQA dataset.

DART: Open-Domain Structured Data Record to Text Generation

TLDR
The dataset construction framework effectively merged heterogeneous sources from open-domain semantic parsing and spoken dialogue systems by utilizing techniques including tree ontology annotation, question-answer pair to declarative sentence conversion, and predicate unification, all with minimal post-editing.
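
A sketch of the data shape and of one construction step, assuming illustrative content: a DART-style record pairs a set of triples with a target sentence, and question-answer pairs can be turned into declarative sentences by a simple substitution rule (the real conversion handles syntax more carefully):

# Illustrative DART-style record: subject-predicate-object triples paired
# with a target sentence. Content is made up for illustration.
record = {
    "tripleset": [
        ("Alan Shepard", "birthPlace", "New Hampshire"),
        ("Alan Shepard", "occupation", "astronaut"),
    ],
    "target": "Alan Shepard, an astronaut, was born in New Hampshire.",
}

def qa_to_declarative(question, answer):
    # Stand-in rule for QA-to-declarative conversion: "Who/What <rest>?"
    # becomes "<answer> <rest>."; real conversion is more careful.
    rest = question.split(" ", 1)[1].rstrip("?")
    return f"{answer} {rest}."

print(qa_to_declarative("Who commanded Apollo 14?", "Alan Shepard"))
# Alan Shepard commanded Apollo 14.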

Complex Sequential Question Answering: Towards Learning to Converse Over Linked Question Answer Pairs with a Knowledge Graph

TLDR
The task of Complex Sequential QA is introduced, which combines the two tasks of answering factual questions through complex inference over a realistic-sized KG of millions of entities and learning to converse through a series of coherently linked QA pairs.

Search-based Neural Structured Learning for Sequential Question Answering

TLDR
This work proposes a novel dynamic neural semantic parsing framework trained using a weakly supervised reward-guided search that effectively leverages the sequential context to outperform state-of-the-art QA systems that are designed to answer highly complex questions.

KILT: a Benchmark for Knowledge Intensive Language Tasks

TLDR
It is found that a shared dense vector index coupled with a seq2seq model is a strong baseline, outperforming more tailor-made approaches for fact checking, open-domain question answering and dialogue, and yielding competitive results on entity linking and slot filling, by generating disambiguated text.
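
The baseline described above can be sketched as dense retrieval followed by generation. The encoder below is a bag-of-words stand-in; an actual system would use a trained bi-encoder (e.g., DPR) and feed the retrieved passage plus query into a seq2seq generator such as BART or T5:

import numpy as np

# Shared-dense-index + seq2seq sketch: embed all passages once, embed the
# query with the same encoder, retrieve by inner product.
def embed(text, vocab):
    # Stand-in encoder: normalized bag-of-words vector.
    tokens = [w.strip(".?").lower() for w in text.split()]
    v = np.array([tokens.count(w) for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

passages = ["Paris is the capital of France.",
            "The Nile is a river in Africa."]
vocab = sorted({w.strip(".?").lower() for p in passages for w in p.split()})
index = np.stack([embed(p, vocab) for p in passages])  # shared dense index

query = "What is the capital of France?"
scores = index @ embed(query, vocab)                   # inner-product search
print(passages[int(np.argmax(scores))])
# The retrieved passage is concatenated to the query as input to a seq2seq
# generator, which produces the final answer string.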