Finding Opinion Manipulation Trolls in News Community Forums

@inproceedings{Mihaylov2015FindingOM,
  title={Finding Opinion Manipulation Trolls in News Community Forums},
  author={Todor Mihaylov and Georgi Georgiev and Preslav Nakov},
  booktitle={CoNLL},
  year={2015}
}
The emergence of user forums in electronic news media has given rise to the proliferation of opinion manipulation trolls. Finding such trolls automatically is a hard task, as there is no easy way to recognize or even to define what they are; this also makes it hard to get training and testing data. We solve this issue pragmatically: we assume that a user who is called a troll by several people is likely to be one. We experiment with different variations of this definition, and in each case we… 
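The labeling assumption in the abstract (a user called a troll by several people is likely to be one) can be illustrated with a minimal sketch. Everything in it is hypothetical: the function name `label_mentioned_trolls`, the `(accuser, target)` input format, and the `min_accusers` threshold are illustrative assumptions, not the authors' implementation. This excerpt does not state which thresholds the paper used.

```python
from collections import defaultdict


def label_mentioned_trolls(accusations, min_accusers=3):
    """Return users called a troll by at least `min_accusers` distinct users.

    `accusations` is an iterable of (accuser, target) pairs. Both the input
    format and the default threshold are assumptions made for illustration.
    """
    accusers_by_target = defaultdict(set)
    for accuser, target in accusations:
        if accuser != target:  # ignore self-accusations
            accusers_by_target[target].add(accuser)
    return {
        target
        for target, accusers in accusers_by_target.items()
        if len(accusers) >= min_accusers
    }


# Toy data: "zed" is accused by three distinct users, "yan" by one.
accusations = [
    ("ann", "zed"), ("bob", "zed"), ("cat", "zed"),
    ("ann", "zed"),  # a repeated accusation by the same user counts once
    ("bob", "yan"),
]
trolls = label_mentioned_trolls(accusations, min_accusers=3)
```

Varying `min_accusers` loosely mirrors the abstract's remark about experimenting with different variations of the definition: a lower threshold labels more users, and a higher one trades coverage for confidence.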

Hunting for Troll Comments in News Community Forums
TLDR
In this work, two classifiers are built that can distinguish a post by a paid troll from one by a non-troll with 81-82% accuracy; the same classifier achieves 81-82% accuracy on so-called mentioned-troll vs. non-troll posts.
A Trolling Hierarchy in Social Media and A Conditional Random Field For Trolling Detection
TLDR
A comprehensive categorization of trolling phenomena, inspired by politeness research, is presented, and a model is proposed that jointly predicts four crucial aspects of trolling: intention, interpretation, intention disclosure, and response strategy.
Stance Detection in Fake News A Combined Feature Representation
TLDR
This paper presents an approach that combines lexical, word-embedding, and n-gram features to detect stance in fake news, and investigates the importance of different lexicons in detecting the classification labels.
TrollsWithOpinion: A Dataset for Predicting Domain-specific Opinion Manipulation in Troll Memes
TLDR
This work enhanced an existing dataset by annotating the data with the authors' defined classes, resulting in a dataset of 8,881 IWT or multimodal troll memes in the English language (TrollsWithOpinion dataset), and shows that existing state-of-the-art techniques could only reach a weighted-average F1-score of 0.37.
Modeling Trolling in Social Media Conversations
TLDR
A trolling categorization that is novel in the sense that it allows comment-based analysis from both the trolls' and the responders' perspectives, characterizing these two perspectives using four aspects, namely, the troll's intention and his intention disclosure, as well as the responder's interpretation of the troll's intention and her response strategy.
Do Not Trust the Trolls: Predicting Credibility in Community Question Answering Forums
TLDR
The problem is motivated, a publicly available annotated English corpus is created by crowdsourcing, and a large set of features for predicting the credibility of the answers is proposed, showing that the credibility labels can be predicted with high performance according to several standard IR ranking metrics.
From Royals to Vegans: Characterizing Question Trolling on a Community Question Answering Website
TLDR
This paper identifies a set of over 400,000 troll questions on Yahoo Answers that aim to inflame, upset, and draw attention from others in the community, and reveals unique characteristics of troll questions when compared to "regular" questions, with regard to their metadata, text, and askers.
In Search of Credible News
TLDR
This work proposes a language-independent approach for automatically distinguishing credible from fake news, based on a rich feature set that uses linguistic, credibility-related, and semantic features from four online sources.
Predicting the Role of Political Trolls in Social Media
TLDR
Experiments on the “IRA Russian Troll” dataset show that the methodology improves over the state-of-the-art in the first scenario, while providing a compelling case for the second scenario, which has not been explored in the literature thus far.
SALSA: Detection of Cybertrolls using Sentiment, Aggression, Lexical and Syntactic Analysis of Tweets
TLDR
From experiments and analysis, it was shown that sentiment, aggression, lexical, and syntactic textual features are indeed sufficient for a classifier to perform well in detecting whether or not a tweet was meant to troll.

References

Do Not Feel The Trolls
TLDR
The aim of this work is to use sentic computing, a new paradigm for the affective analysis of natural language text, to detect trolls and hence prevent web-users from being emotionally hurt by malicious posts.
Filtering Offensive Language in Online Communities using Grammatical Relations
TLDR
This paper analyzes offensive language in text messages posted in online communities, and proposes a new automatic sentence-level filtering approach that is able to semantically remove the offensive language by utilizing the grammatical relations among words.
Accurately detecting trolls in Slashdot Zoo via decluttering
TLDR
A general algorithm called TIA (short for Troll Identification Algorithm) to classify users of an online “signed” social network as malicious (e.g. trolls on Slashdot) or benign (i.e. normal honest users).
Mining the peanut gallery: opinion extraction and semantic classification of product reviews
TLDR
This work develops a method for automatically distinguishing between positive and negative reviews and draws on information retrieval techniques for feature extraction and scoring, and the results for various metrics and heuristics vary depending on the testing situation.
Spatio-temporal grounding of claims made on the web
TLDR
This project note introduces the spatio-temporal challenges and planned semantic annotation activities that are part of the PHEME project.
Detecting Offensive Language in Social Media to Protect Adolescent Online Safety
  • Ying Chen, Yilu Zhou, Sencun Zhu, Heng Xu
  • Computer Science
    2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing
  • 2012
TLDR
This work proposes the Lexical Syntactic Feature (LSF) architecture to detect offensive content and identify potentially offensive users in social media, and incorporates a user's writing style, structure, and specific cyberbullying content as features to predict the user's potential to send out offensive content.
Mining and summarizing customer reviews
TLDR
This research aims to mine and to summarize all the customer reviews of a product, and proposes several novel techniques to perform these tasks.
Machine learning in automated text categorization
TLDR
This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
Adversarial Web Search
TLDR
It is shown that search engine spammers create false content and misleading links to lure unsuspecting visitors to pages filled with advertisements or malware, and work over the past decade or so that aims to discover such spamming activities is examined, demonstrating that this conflict is far from over.