Corpus ID: 14644892

Answer Extraction as Sequence Tagging with Tree Edit Distance

Xuchen Yao, Benjamin Van Durme, Chris Callison-Burch, Peter Clark
Our goal is to extract answers from pre-retrieved sentences for Question Answering (QA). We construct a linear-chain Conditional Random Field based on pairs of questions and their possible answer sentences, learning the association between questions and answer types. This casts answer extraction as an answer sequence tagging problem for the first time, where knowledge of shared structure between question and source sentence is incorporated through features based on Tree Edit Distance (TED). Our…
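As a rough illustration of what "answer extraction as sequence tagging" means, the sketch below labels each token of a candidate sentence as beginning, inside, or outside an answer span, and then recovers the answer text from those labels. The tag names (`B-ANS`, `I-ANS`, `O`), the helper functions, and the example sentence are assumptions made for illustration, not the authors' code. The CRF and the TED-based features are not modeled here; a trained tagger would predict the labels that `tag_answer` writes by hand.

```python
from typing import List


def tag_answer(tokens: List[str], start: int, end: int) -> List[str]:
    """Mark tokens[start:end] as the answer span with B-ANS/I-ANS/O tags."""
    tags = ["O"] * len(tokens)
    for i in range(start, end):
        tags[i] = "B-ANS" if i == start else "I-ANS"
    return tags


def extract_answers(tokens: List[str], tags: List[str]) -> List[str]:
    """Recover answer strings from a tag sequence (e.g. a tagger's output)."""
    answers: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-ANS":
            # A new span starts; close any span that is still open.
            if current:
                answers.append(" ".join(current))
            current = [token]
        elif tag == "I-ANS" and current:
            current.append(token)
        else:
            # An O tag (or a stray I-ANS with no open span) ends the span.
            if current:
                answers.append(" ".join(current))
            current = []
    if current:
        answers.append(" ".join(current))
    return answers


# Hypothetical candidate sentence for a "who" question.
tokens = "Tennis player Jennifer Capriati is 23".split()
tags = tag_answer(tokens, 2, 4)
print(tags)                           # ['O', 'O', 'B-ANS', 'I-ANS', 'O', 'O']
print(extract_answers(tokens, tags))  # ['Jennifer Capriati']
```

Framing the task this way lets a standard sequence model score every token in context, so answers of any length can be extracted without enumerating candidate spans.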


Feature-driven Question Answering With Natural Language Alignment

This dissertation proposes the idea of feature-driven QA, a machine learning framework that automatically produces rich features from linguistic annotations of answer fragments and encodes them in compact log-linear models.

Automatic Feature Engineering for Answer Selection and Extraction

The results show that the models greatly improve on the state of the art, e.g., up to 22% on F1 (relative improvement) for answer extraction, while using no additional resources and no manual feature engineering.

Question Answering on Freebase via Relation Extraction and Textual Evidence

This work first presents a neural-network-based relation extractor to retrieve candidate answers from Freebase, and then infers over Wikipedia to validate these answers, yielding a substantial improvement over the state of the art.

Deep Learning for Answer Sentence Selection

This work proposes a novel approach to the answer sentence selection task by means of distributed representations, and learns to match questions with answers by considering their semantic encoding.

Improved Representation Learning for Question Answer Matching

This work develops hybrid models that process the text using both convolutional and recurrent neural networks, combining the merits of both structures in extracting linguistic information to address passage answer selection.

A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering

The proposed method uses a stacked bidirectional Long Short-Term Memory network to sequentially read words from question and answer sentences and then output their relevance scores, outperforming previous work that requires syntactic features and external knowledge resources.

WikiQA: A Challenge Dataset for Open-Domain Question Answering

This paper describes the WIKIQA dataset, a new publicly available set of question and sentence pairs collected and annotated for research on open-domain question answering, which is more than an order of magnitude larger than the previous dataset.

Multiple-Choice Question Answering Over Semi-Structured Tables

This thesis builds a QA system that can answer multiple-choice questions based on semi-structured tables, and achieves a huge improvement over the previous state-of-the-art system.

Shallow and Deep Syntactic/Semantic Structures for Passage Reranking in Question-Answering Systems

This article extensively studies the use of syntactic and semantic structures obtained with shallow and full syntactic parsers for answer passage reranking and derives the following important findings: relational syntactic structures are essential to achieve superior results, and models trained with dependency trees can outperform those trained with shallow trees.

Hybrid Question Answering over Knowledge Base and Free Text

This paper presents a hybrid question answering (hybrid-QA) system that exploits both a structured knowledge base and free text to answer a question, and develops an integer linear program (ILP) model to infer over the candidate answers and provide a globally optimal solution.



Mapping Dependencies Trees: An Application to Question Answering

An approach for answer selection in a free form question answering task is described, representing both questions and candidate passages using dependency trees, and incorporating semantic information such as named entities in this representation.

Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering

This work captures the alignment by using a novel probabilistic model that models tree-edit operations on dependency parse trees and treats alignments as structured latent variables, and offers a principled framework for incorporating complex linguistic features.

Complex Cross-lingual Question Answering as a Sequential Classification and Multi-Document Summarization Task

The JAVELIN IV system, which treats complex question answering as a sequential classification and multi-document summarization task, is described, and the use of different units of extraction, the effect of different syntactic features for classification, and the effect of different summarization strategies are discussed.

Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions

A logistic regression model uses 33 syntactic features of edit sequences to classify sentence pairs, leading to competitive performance in recognizing textual entailment, paraphrase identification, and answer selection for question answering.

Patterns of Potential Answer Expressions as Clues to the Right Answers

The participation at TREC-10 was a test for some basic mechanisms of the text processing technology developed in the framework of the CrossReader project and these mechanisms will be implemented in the new TextRoller versions.

Is It the Right Answer? Exploiting Web Redundancy for Answer Validation

This work presents a novel approach to answer validation based on the intuition that the amount of implicit knowledge which connects an answer to a question can be quantitatively estimated by exploiting the redundancy of Web information.

What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA

A probabilistic quasi-synchronous grammar, inspired by one proposed for machine translation, and parameterized by mixtures of a robust nonlexical syntax/alignment model with a(n optional) lexical-semantics-driven log-linear model is proposed.

Testing the Reasoning for Question Answering Validation

This article quantifies and discusses the source of errors introduced by the reformulation of the answer validation problem in terms of textual entailment, and proposes an evaluation framework for AV linked to a QA evaluation track.

Exploiting Syntactic and Shallow Semantic Kernels for Question Answer Classification

The experiments suggest that syntactic information helps tasks such as question/answer classification and that shallow semantics makes a remarkable contribution when a reliable set of predicate argument structures (PASs) can be extracted, e.g. from answers.

Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums

A general framework based on Conditional Random Fields (CRFs) to detect the contexts and answers of questions from forum threads is proposed, then improved with skip-chain CRFs and 2D CRFs to better accommodate the features of forums.