Neural Language Model Based Training Data Augmentation for Weakly Supervised Early Rumor Detection

  • Sooji Han, Jie Gao, Fabio Ciravegna
  • Published 16 July 2019
  • Computer Science
  • 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)
The scarcity and class imbalance of training data are known issues in current rumor detection tasks. A state-of-the-art neural language model (NLM) and large credibility-focused Twitter corpora are employed to learn context-sensitive representations of rumor tweets. Six different real-world events based on three publicly available rumor datasets are employed in our experiments to provide a comparative evaluation of the effectiveness of the method. The results show that our method can expand the…
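
The abstract is truncated, so the exact augmentation procedure is not stated here. As a rough illustration only, the sketch below assumes that augmentation means weakly labeling unlabeled tweets whose representations lie close to those of known rumor tweets. The names `toy_embed`, `augment`, and `threshold`, the bag-of-words vectors standing in for NLM representations, and the example tweets are all hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an NLM encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def augment(labeled_rumors, unlabeled, threshold=0.6):
    """Return (tweet, score) pairs for unlabeled tweets whose best similarity
    to any labeled rumor tweet meets the (assumed) threshold."""
    refs = [toy_embed(t) for t in labeled_rumors]
    selected = []
    for tweet in unlabeled:
        vec = toy_embed(tweet)
        best = max((cosine(vec, r) for r in refs), default=0.0)
        if best >= threshold:
            selected.append((tweet, round(best, 3)))
    return selected


# Toy data, invented for illustration.
labeled = ["gunman takes hostages in sydney cafe"]
pool = [
    "hostages held by gunman in sydney cafe",
    "lovely weather in sydney today",
]
weak = augment(labeled, pool)
```

With a real encoder, `toy_embed` would be replaced by the model's sentence representation. The threshold trades off how many weak labels are added against how noisy they are, so it would need tuning on held-out data.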


RP-DNN: A Tweet Level Propagation Context Based Deep Neural Networks for Early Rumor Detection in Social Media
A novel hybrid neural network architecture is presented, which combines a task-specific character-based bidirectional language model and stacked Long Short-Term Memory networks to represent textual contents and social-temporal contexts of input source tweets, for modelling propagation patterns of rumors in the early stages of their development.
Adapting Pre-trained Language Models to Rumor Detection on Twitter
This paper proposes an approach that seeks to detect emerging and unseen rumors on Twitter by adapting a pre-trained language model, namely RoBERTa, to the task of rumor detection, and shows that the approach outperforms state-of-the-art approaches on all metrics.
Identifying Possible Rumor Spreaders on Twitter: A Weak Supervised Learning Approach
This work uses the publicly available PHEME dataset, which contains information on rumor and non-rumor tweets, explores the Graph Convolutional Network (GCN), a type of Graph Neural Network (GNN), and compares GCN results with other approaches: SVM, RF, and LSTM.
A Graph Neural Network based approach for detecting Suspicious Users on Online Social Media
Graph Convolutional Network (GCN), a type of GNN, is explored for identifying suspicious users, and its results are compared with three baseline approaches: SVM, RF, and an LSTM-based deep learning architecture.
ARGH!: Automated Rumor Generation Hub
It is still challenging to effectively identify rumors due to rapid changes in people's interests and perceptions. To enhance rumor detectors, we first need to better understand which rumors are
The Impacts of the Contextual Substitutions in Vietnamese Micro-text Augmentation


Call Attention to Rumors: Deep Attention Based Recurrent Neural Networks for Early Rumor Detection
A deep attention model based on recurrent neural networks to selectively learn temporal representations of sequential posts for rumor identification that outperforms state-of-the-art baselines by detecting rumors more quickly and accurately than competitors.
Detecting Rumors from Microblogs with Recurrent Neural Networks
A novel method that learns continuous representations of microblog events for identifying rumors based on recurrent neural networks that detects rumors more quickly and accurately than existing techniques, including the leading online rumor debunking services.
Learning Reporting Dynamics during Breaking News for Rumour Detection in Social Media
A novel approach to rumour detection that learns from the sequential dynamics of reporting during breaking news in social media to detect rumours in new stories and achieves competitive performance, beating the state-of-the-art classifier that relies on querying tweets with improved precision and recall, as well as outperforming the best baseline.
Exploiting Context for Rumour Detection in Social Media
A novel approach using Conditional Random Fields that learns from the sequential dynamics of social media posts is compared with the current state-of-the-art rumour detection system, as well as other baselines, and the results provide evidence for the generalisability of the classifier.
CREDBANK: A Large-Scale Social Media Corpus With Associated Credibility Annotations
CREDBANK is a corpus of tweets, topics, events and associated human credibility judgements designed to bridge the gap between machine and human computation in online information credibility in fields such as social science, data mining and health.
Rumor Detection on Twitter with Tree-structured Recursive Neural Networks
This work proposes two recursive neural models based on bottom-up and top-down tree-structured neural networks for rumor representation learning and classification, which naturally conform to the propagation layout of tweets.
Do Rumors Diffuse Differently from Non-rumors? A Systematically Empirical Analysis in Sina Weibo for Rumor Identification
This paper systematically investigates the problem of automatically identifying rumors in social media from a diffusion perspective using Sina Weibo data, and develops classifiers to discriminate rumors and non-rumors.
Enquiring Minds: Early Detection of Rumors in Social Media from Enquiry Posts
A technique based on searching for the enquiry phrases, clustering similar posts together, and then collecting related posts that do not contain these simple phrases, which ranks the clusters by their likelihood of really containing a disputed factual claim.
Rumor Detection over Varying Time Windows
This study determines the major difference between rumors and non-rumors and explores rumor classification performance levels over varying time windows—from the first three days to nearly two months.
Stance Classification in Out-of-Domain Rumours: A Case Study Around Mental Health Disorders
This study examines performance stability when switching to the new domain of mental health disorders, and confirms that performance drops when the trained model is applied to a new domain, emphasising the differences in rumours across domains.