Corpus ID: 1670839

Heuristic Feature Selection for Clickbait Detection

@article{Wiegmann2018HeuristicFS,
  title={Heuristic Feature Selection for Clickbait Detection},
  author={Matti Wiegmann and Michael V{\"o}lske and Benno Stein and Matthias Hagen and Martin Potthast},
  journal={ArXiv},
  year={2018},
  volume={abs/1802.01191}
}
We study feature selection as a means to optimize the baseline clickbait detector employed at the Clickbait Challenge 2017. The challenge's task is to score the "clickbaitiness" of a given Twitter tweet on a scale from 0 (no clickbait) to 1 (strong clickbait). Unlike most other approaches submitted to the challenge, the baseline approach is based on manual feature engineering and does not compete out of the box with many of the deep learning-based approaches. We show that scaling up feature…
The Clickbait Challenge 2017: Towards a Regression Model for Clickbait Strength
TLDR
The Clickbait Challenge 2017 was a shared task inviting the submission of clickbait detectors for a comparative evaluation, and a total of 13 detectors have been submitted, achieving significant improvements over the previous state of the art in terms of detection performance.
A Web Service that Catches Clickbaits on News Articles
  • H. Sevinc
  • Computer Science
  • 2019 27th Signal Processing and Communications Applications Conference (SIU)
  • 2019
TLDR
To open this classifier to public use, a web service has been created with the model; a client can classify any news article as clickbait or non-clickbait by providing the title and body paragraphs of the article.
Tabloids in the Era of Social Media?
TLDR
Comparing the advent of clickbaits with the rise of tabloidization of news, this work brings out several important insights regarding the news consumers as well as the media organizations promoting news stories on Twitter.
“We already think of the marketing behind it.” An exploratory qualitative research of the online sharing behaviour of consumers, and the implications for viral marketing.
This research aimed to find out more about virality as a whole, the sharing motivations of social media users, and what this means for viral marketing. Current literature has not yet discussed…

References

The Clickbait Challenge 2017: Towards a Regression Model for Clickbait Strength
TLDR
The Clickbait Challenge 2017 was a shared task inviting the submission of clickbait detectors for a comparative evaluation, and a total of 13 detectors have been submitted, achieving significant improvements over the previous state of the art in terms of detection performance.
Clickbait Detection
TLDR
This research proves that it is possible to identify clickbaits using all parts of the post while using the minimum number of features possible.
Clickbait Detection
TLDR
This paper proposes a new model for the detection of clickbait, i.e., short messages that lure readers to click a link, based on 215 features that enables a random forest classifier to achieve 0.79 ROC-AUC at 0.76 precision and 0.76 recall.
Crowdsourcing a Large Corpus of Clickbait on Twitter
TLDR
A new corpus of 38,517 annotated Twitter tweets, the Webis Clickbait Corpus 2017, is constructed to address the urging task of clickbait detection.
An Introduction to Variable and Feature Selection
TLDR
The contributions of this special issue cover a wide range of aspects of variable selection: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.
VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text
TLDR
Interestingly, using the authors' parsimonious rule-based model to assess the sentiment of tweets, it is found that VADER outperforms individual human raters, and generalizes more favorably across contexts than any of their benchmarks.
Improving the Reproducibility of PAN's Shared Tasks: Plagiarism Detection, Author Identification, and Author Profiling
TLDR
This paper reports on the PAN 2014 evaluation lab which hosts three shared tasks on plagiarism detection, author identification, and author profiling, which forms the largest collection of softwares for these tasks to date.
Feature Selection for SVMs
TLDR
The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA microarray data.
Readability Revisited: The New Dale-Chall Readability Formula