Corpus Wide Argument Mining - a Working Solution

Liat Ein-Dor, Eyal Shnarch, Lena Dankin, Alon Halfon, Benjamin Sznajder, Ariel Gera, Carlos Alzate, Martin Gleize, Leshem Choshen, Yufang Hou, Yonatan Bilu, Ranit Aharonov, Noam Slonim
AAAI Conference on Artificial Intelligence

One of the main tasks in argument mining is the retrieval of argumentative content pertaining to a given topic. Most previous work addressed this task by retrieving a relatively small number of relevant documents as the initial source for such content. This line of research yielded moderate success, which is of limited use in a real-world system. Furthermore, for such a system to yield a comprehensive set of relevant arguments, over a wide range of topics, it requires leveraging a large and… 

On the Effect of Sample and Topic Sizes for Argument Mining Datasets

This work inquires whether it is necessary for acceptable performance of Argument Mining to have datasets growing in size or, if not, how smaller datasets have to be composed for optimal performance.

Diversity Aware Relevance Learning for Argument Search

This work introduces a new multi-step approach to the argument retrieval problem that employs a machine learning model to capture semantic relationships between arguments, leading to a significant improvement on the argument retrieval task even though it requires less data.

A Large-scale Dataset for Argument Quality Ranking: Construction and Analysis

This work addresses the core issue of inducing a labeled score from crowd annotations by performing a comprehensive evaluation of different approaches to this problem, and presents a neural method for argument quality ranking, which outperforms several baselines on the authors' own dataset, as well as previous methods published for another dataset.

Argument Extraction for Key Point Generation Using MMR-Based Methods

Experimental results show that MoverScore-based MMR outperforms strong baselines, covering 72.5% of arguments when eleven or more arguments are extracted, which is almost identical to the coverage rate of human-made key points.

Multilingual Argument Mining: Datasets and Analysis

This work explores the potential of transfer learning using the multilingual BERT model to address argument mining tasks in non-English languages, based on English datasets and the use of machine translation, and focuses on the translate-train approach.

From Arguments to Key Points: Towards Automatic Argument Summarization

It is shown, by analyzing a large dataset of crowd-contributed arguments, that a small number of key points per topic is typically sufficient for covering the vast majority of the arguments and that a domain expert can often predict these key points in advance.

A Robustness Evaluation Framework for Argument Mining

A robustness evaluation framework is proposed to guide the design of rigorous argument mining models, and it is argued that the framework should be used in conjunction with standard performance evaluation techniques as a measure of model stability.

Topic Ontologies for Arguments

This paper contributes the first comprehensive survey of topic coverage, assessing 45 argument corpora, and shows that the corpora topics, which are mostly those frequently discussed in public online fora, are covered well by the sources.

Learning to Rank Arguments with Feature Selection

Five runs tackling argument retrieval with four different methodological paradigms are submitted, including a Dirichlet-smoothed language model with filtering of low-quality arguments, a reranking approach that casts argument retrieval as a question-answering (QA) task, and a transformer-based query expansion method that enriches the query with topically relevant keywords.

MultiOpEd: A Corpus of Multi-Perspective News Editorials

MultiOpEd, an open-domain news editorial corpus that supports various tasks pertaining to the argumentation structure in news editorials, focusing on automatic perspective discovery, is proposed and shown to improve the quality of the perspective summary generated.



Towards an argumentative content search engine using weak supervision

This work uses a weak signal to define weak labels for training DNNs to obtain significantly greater performance, adapts the system to solve a recent argument mining task of identifying argumentative sentences in Web texts retrieved from heterogeneous sources, and obtains F1 scores comparable to the supervised baseline.

Unsupervised corpus–wide claim detection

This work derives a claim sentence query by which it is able to directly retrieve sentences in which the prior probability to include topic-relevant claims is greatly enhanced, leading to an unsupervised corpus–wide claim detection system with precision that outperforms previously reported results on the task of claim detection given relevant documents and labeled data.

Cross-topic Argument Mining from Heterogeneous Sources

This paper proposes a new sentential annotation scheme that is reliably applicable by crowd workers to arbitrary Web texts and shows that integrating topic information into bidirectional long short-term memory networks outperforms vanilla BiLSTMs in F1 in two- and three-label cross-topic settings.

Cross-Domain Mining of Argumentative Text through Distant Supervision

This work presents a distant supervision approach that acquires argumentative text segments automatically from online debate portals and freely provides the underlying corpus for research, showing that training on such a corpus improves the effectiveness and robustness of mining argumentative text.

Five Years of Argument Mining: a Data-driven Analysis

This paper presents the argument mining tasks, and the obtained results in the area from a data-driven perspective, and highlights the main weaknesses suffered by the existing work in the literature, and proposes open challenges to be faced in the future.

Which argument is more convincing? Analyzing and predicting convincingness of Web arguments using bidirectional LSTM

This work annotates a large dataset of 16k pairs of arguments over 32 topics and investigates whether the relation “A is more convincing than B” exhibits properties of total ordering; these findings are used as global constraints for cleaning the crowdsourced data.

Context Dependent Claim Detection

This work formally defines the challenging task of automatic claim detection in a given context, outlines a preliminary solution, and assesses its performance over annotated real-world data, collected specifically for that purpose over hundreds of Wikipedia articles.

An Online Annotation Assistant for Argument Schemes

This work presents one step in the development of improved datasets by integrating the Argument Scheme Key – a novel annotation method based on one of the most popular typologies of argument schemes – into the widely used OVA software for argument analysis.

Identifying Appropriate Support for Propositions in Online User Comments

A framework for automatically classifying each proposition as UNVERIFIABLE, VERIFIABLE NONEXPERIENTIAL, or VERIFIABLE EXPERIENTIAL is developed, where the appropriate type of support is reason, evidence, and optional evidence, respectively.

End-to-End Argumentation Mining in Student Essays

This work presents the first results on end-to-end argument mining in student essays using a pipeline approach, and addresses error propagation inherent in the pipeline approach by performing joint inference over the outputs of the tasks in an Integer Linear Programming (ILP) framework.