Dataset of Fake News Detection and Fact Verification: A Survey
@article{Murayama2021DatasetOF, title={Dataset of Fake News Detection and Fact Verification: A Survey}, author={Taichi Murayama}, journal={ArXiv}, year={2021}, volume={abs/2111.03299} }
The rapid increase in fake news, which causes significant damage to society, triggers many fake news related studies, including the development of fake news detection and fact verification techniques. The resources for these studies are mainly available as public datasets taken from Web data. We surveyed 118 datasets related to fake news research on a large scale from three perspectives: (1) fake news detection, (2) fact verification, and (3) other tasks; for example, the analysis of fake news…
10 Citations
Annotation-Scheme Reconstruction for “Fake News” and Japanese Fake News Dataset
- Computer Science
- LREC
- 2022
This work proposes a novel annotation scheme with fine-grained labeling, based on detailed investigations of existing fake news datasets, to capture the various aspects of fake news.
FakeSV: A Multimodal Benchmark with Rich Social Context for Fake News Detection on Short Video Platforms
- Computer Science
- ArXiv
- 2022
This paper constructs the largest Chinese short video dataset about fake news, named FakeSV, which includes news content, user comments, and publisher comments simultaneously, and provides a new multimodal detection model named SV-FEND, which exploits the cross-modal correlations to select the most informative features and utilizes the social context information for detection.
Methods of Informational Trends Analytics and Fake News Detection on Twitter
- Computer Science
- ArXiv
- 2022
Information trends caused by the Russian invasion of Ukraine in 2022 are studied, and the possible impact of these trends on different companies working in Russia during the invasion is considered.
"This is Fake News": Characterizing the Spontaneous Debunking from Twitter Users to COVID-19 False Information
- Computer Science
- ArXiv
- 2022
It is found that most fake tweets are left undebunked, and that spontaneous debunking is slower than other forms of response and exhibits partisanship on political topics.
Who Funds Misinformation? A Systematic Analysis of the Ad-related Profit Routines of Fake News sites
- Business
- ArXiv
- 2022
Fake news is an age-old phenomenon, widely assumed to be associated with political propaganda published to sway public opinion. Yet, with the growth of social media it has become a lucrative business…
CsFEVER and CTKFacts: Acquiring Czech data for fact verification
- Computer Science
- 2022
In this paper, we examine several methods of acquiring Czech data for automated fact-checking, which is a task commonly modeled as a classification of textual claim veracity w.r.t. a corpus of trusted…
CsFEVER and CTKFacts: Czech Datasets for Fact Verification
- Computer Science
- ArXiv
- 2022
This paper presents two Czech datasets for fact verification, analyzes them for spurious cues (annotation patterns that lead to model overfitting), and describes a method to automatically generate wider claim contexts (dictionaries) for non-hyperlinked corpora.
Detecting and classifying online health misinformation with ‘Content Similarity Measure (CSM)’ algorithm: an automated fact-checking-based approach
- Computer Science
- The Journal of Supercomputing
- 2023
An extensive analysis of the proposed algorithm compared with standard similarity measures and machine learning classifiers showed that the ‘content similarity score’ feature outperformed other features with an accuracy of 88.26%.
TA-WHI
- Computer Science
- International Journal of Software Science and Computational Intelligence
- 2023
A model named Text Analysis of Web-based Health Information (TA-WHI), based on an algorithm designed for this, categorizes health-related social media feeds into five categories: sufficient, fabricated, meaningful, advertisement, and misleading.
Statistical learning from Brazilian fake news
- Computer Science
- Expert Systems
- 2022
The results show that four variables are significant in explaining fake news, and the model achieved results comparable with the state of the art (0.941 F-measure) for a single classifier, while having the advantage of being a parsimonious model.
References
SHOWING 1-10 OF 243 REFERENCES
Fake news detection: a survey of evaluation datasets
- Computer Science
- PeerJ Comput. Sci.
- 2021
This survey systematically reviews popular datasets for fake news detection by providing insights into the characteristics of each dataset and comparative analysis among them, along with a set of requirements for comparing and building new datasets.
Survey on Fake News Detection Techniques
- Business, Computer Science
- ICIP 2020
- 2020
This survey comprehensively and systematically studies different methodologies for the detection of fake news in digital media, and identifies and specifies fundamental theories in Machine Learning to facilitate and enhance research on fake news detection.
A Survey on Natural Language Processing for Fake News Detection
- Computer Science
- LREC
- 2020
The challenges involved in fake news detection are described; the task formulations, datasets, and NLP solutions that have been developed for this task are compared; and their potential and limitations are discussed.
Mitigation of Diachronic Bias in Fake News Detection Dataset
- Computer Science
- WNUT
- 2021
This study confirms the bias, especially in proper nouns including person names, from the deviation of phrase appearances in each dataset. It proposes masking methods using Wikidata to mitigate the influence of person names and validates whether they make fake news detection models robust through experiments with in-domain and out-of-domain data.
Combating Fake News: A Survey on Identification and Mitigation Techniques
- Computer Science
- ArXiv
- 2019
This survey describes the modern-day problem of fake news, highlights in particular the technical challenges associated with it, and comprehensively compiles and summarizes characteristic features of available datasets.
Fake News Detection using Temporal Features Extracted via Point Process
- Computer Science
- ICWSM Workshops
- 2020
This paper proposes a novel multi-modal attention-based method for detecting fake news from SNS posts, which includes linguistic and user features alongside temporal features and uses a point process algorithm to distinguish fake news from real news.
Automatic Detection of Fake News
- Computer Science
- COLING
- 2018
This paper introduces two novel datasets for the task of fake news detection, covering seven different news domains, and conducts a set of learning experiments to build accurate fake news detectors that can achieve accuracies of up to 76%.
Early Detection of Fake News by Utilizing the Credibility of News, Publishers, and Users based on Weakly Supervised Learning
- Computer Science
- COLING
- 2020
This paper proposes a novel structure-aware multi-head attention network (SMAN), which combines the news content, publishing, and reposting relations of publishers and users to jointly optimize the fake news detection and credibility prediction tasks. It can detect fake news in 4 hours with an accuracy of over 91%, which is much faster than the state-of-the-art models.
FakeNewsNet: A Data Repository with News Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social Media
- Computer Science
- Big Data
- 2020
A fake news data repository, FakeNewsNet, is presented. It contains two comprehensive data sets with diverse features in news content, social context, and spatiotemporal information, and its potential applications for studying fake news on social media are discussed.