DeFactoNLP: Fact Verification using Entity Recognition, TFIDF Vector Comparison and Decomposable Attention

Aniketh Janardhan Reddy, Gil Rocha, Diego Esteves
In this paper, we describe DeFactoNLP, the system we designed for the FEVER 2018 Shared Task. The aim of this task was to conceive a system that can not only automatically assess the veracity of a claim but also retrieve evidence supporting this assessment from Wikipedia. In our approach, the Wikipedia documents whose Term Frequency-Inverse Document Frequency (TFIDF) vectors are most similar to the vector of the claim and those documents whose names are similar to those of the named entities…
An Automated Fact Checking System Using Deep Learning Through Word Embedding
This paper examines how deep learning methods can improve the performance of automatic fact verification and proposes a deep learning system that helps people assess the authenticity of most claims and provides evidence selected from knowledge sources such as Wikipedia.
SimpleLSTM: A Deep-Learning Approach to Simple-Claims Classification
An overview of recent approaches to Automated Fact-Checking is given, emphasizing the key challenges faced during the development of such frameworks, and a new model dubbed SimpleLSTM is introduced, which outperforms baselines by 11%, 10.2% and 18.7% on the FEVER-Support, FEVER-Reject and 3-Class datasets respectively.
Validation of Facts Against Textual Sources
A system is proposed that verifies a claim against a textual source and classifies the claim as true, false, out-of-context or inappropriate with respect to that source.
A Deep-Learning Approach to Simple-Claims Classification 3 tries to define the problem of fact-checking as a one-to-one automation mapping of the human fact-checking process
The information on the internet suffers from noise and corrupt knowledge that may arise due to human and mechanical errors. To further exacerbate this problem, an ever-increasing amount of fake news…
An Open-Domain Web Search Engine for Answering Comparative Questions
We present an open-domain web search engine that can help answer comparative questions like “Is X better than Y for Z?” by providing argumentative documents. Building such a system requires multiple…
Automatic Fake News Detection: Are Models Learning to Reason?
Surprisingly, on political fact-checking datasets the highest effectiveness is most often obtained by using only the evidence, as including the claim has either a negligible or a harmful impact on effectiveness.
Detecting Fake News with Tweets’ Properties
  • Ning Xin Nyow, Hui Na Chua
  • Computer Science
  • 2019 IEEE Conference on Application, Information and Network Security (AINS)
  • 2019
The mechanisms for identifying significant Tweet attributes and an application architecture to systematically automate the classification of online news are presented, and data derived from Twitter is used to identify additional significant attributes that influence the accuracy of machine learning methods in classifying news as real or fake.
WhatTheWikiFact: Fact-Checking Claims Against Wikipedia
WhatTheWikiFact, a system for automatic claim verification using Wikipedia, predicts the veracity of an input claim, and it further shows the evidence it has retrieved as part of the verification process.


FEVER: a large-scale dataset for Fact Extraction and VERification
This paper introduces a new publicly available dataset for verification against textual sources, FEVER, which consists of 185,445 claims generated by altering sentences extracted from Wikipedia and subsequently verified without knowledge of the sentence they were derived from.
DeFacto - Temporal and multilingual Deep Fact Validation
DeFacto (Deep Fact Validation) is an algorithm that validates facts by finding trustworthy sources for them on the Web, supplying the user with relevant excerpts of web pages as well as useful additional information, including a score for the confidence DeFacto has in the correctness of the input fact.
Reading Wikipedia to Answer Open-Domain Questions
This approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs, indicating that both modules are highly competitive with respect to existing counterparts.
Automated Fact Checking: Task Formulations, Methods and Future Directions
This paper surveys automated fact checking research stemming from natural language processing and related disciplines, unifying the task formulations and methodologies across papers and authors, and highlights the use of evidence as an important distinguishing factor that cuts across task formulations and methods.
A large annotated corpus for learning natural language inference
The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
Toward Veracity Assessment in RDF Knowledge Bases: An Exploratory Analysis
This article focuses on answering questions related to the assessment of the veracity of facts through Deep Fact Validation (DeFacto), a triple validation framework designed to assess facts in RDF knowledge bases, and conducts a thorough analysis of its pipeline, aiming at reducing the error propagation through its components.
Deep Contextualized Word Representations
A new type of deep contextualized word representation is introduced that models both complex characteristics of word use and how these uses vary across linguistic contexts, allowing downstream models to mix different types of semi-supervision signals.
A Decomposable Attention Model for Natural Language Inference
We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially…
Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling
By using simulated annealing in place of Viterbi decoding in sequence models such as HMMs, CMMs, and CRFs, it is possible to incorporate non-local structure while preserving tractable inference.
Learning from Imbalanced Data
A critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario is provided.