IUCL at SemEval-2016 Task 6: An Ensemble Model for Stance Detection in Twitter

Can Liu, Wen Li, Bradford Demarest, Yue (Eleanor) Chen, Sara Couture, Daniel Dakota, Nikita Haduong, Noah Kaufman, Andrew Lamont, Manan Pancholi, Kenneth Steimel, Sandra Kübler
We present the IUCL system, based on supervised learning, for the shared task on stance detection. Our official submission, the random forest model, reaches a score of 63.60 and is ranked 6th out of 19 teams. We also use gradient boosting decision trees and an SVM, and merge all classifiers into an ensemble. Our analysis shows that random forest is good at retrieving minority classes and gradient boosting at retrieving majority classes. The strengths of different classifiers with respect to precision and recall…
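The excerpt above does not say how the individual classifiers are merged, so the following is only a minimal sketch of one common option, hard (majority) voting over per-classifier labels. The function name `majority_vote`, the `tie_breaker` parameter, and the toy predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical per-classifier predictions for four tweets. The three labels
# mirror the task's stance classes; the values are invented for illustration.
predictions = {
    "random_forest":     ["AGAINST", "FAVOR",   "NONE",    "AGAINST"],
    "gradient_boosting": ["AGAINST", "AGAINST", "NONE",    "FAVOR"],
    "svm":               ["FAVOR",   "FAVOR",   "AGAINST", "NONE"],
}


def majority_vote(predictions, tie_breaker):
    """Combine label predictions by hard voting.

    When no single label wins outright, fall back to the prediction of
    the classifier named by `tie_breaker` (an assumed policy, not the
    paper's).
    """
    names = list(predictions)
    n_items = len(predictions[names[0]])
    merged = []
    for i in range(n_items):
        votes = Counter(predictions[name][i] for name in names)
        (top_label, top_count), *rest = votes.most_common()
        if rest and rest[0][1] == top_count:
            # Tie between the leading labels: defer to the chosen classifier.
            merged.append(predictions[tie_breaker][i])
        else:
            merged.append(top_label)
    return merged


ensemble = majority_vote(predictions, tie_breaker="random_forest")
print(ensemble)
```

In practice, a library ensemble such as scikit-learn's `VotingClassifier` could play the same role. The explicit tie-breaker here simply makes visible the design choice of which classifier to trust when the three disagree.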

LTRC IIITH at IBEREVAL 2017: Stance and Gender Detection in Tweets on Catalan Independence
A supervised system using Support Vector Machines with radial basis function kernel to identify the stance and gender of the tweeter using various character level and word level features to achieve a macro-average and accuracy of 0.46 for stance detection in both Spanish and Catalan.
An English-Hindi Code-Mixed Corpus: Stance Annotation and Baseline System
A baseline supervised classification system for stance detection developed using the same dataset that uses various machine learning techniques to achieve an accuracy of 58.7% on 10-fold cross validation is presented.
Stance Identification by Sentiment and Target Detection
The characteristics of different types of topics and the interaction among sentiment, target, and stance in a sentence are discussed, and an approach that identifies stance without stance-labeled data by incorporating the findings on their interaction is proposed.
Tree LSTMs with Convolution Units to Predict Stance and Rumor Veracity in Social Media Conversations
A new way to represent social-media conversations as binarized constituency trees that allows comparing features in source-posts and their replies effectively and to use convolution units in Tree LSTMs that are better at learning patterns in features obtained from the source and reply posts is proposed.
Topical Stance Detection for Twitter: A Two-Phase LSTM Model Using Attention
T-PAN is the first model in the topical stance detection literature to use deep learning within a two-phase architecture; it proposes a Long Short-Term Memory (LSTM) based deep neural network for each phase and embeds attention at each phase.
Classifier Stacking for Native Language Identification
This paper reports the contribution (team WLZ) to the NLI Shared Task 2017 (essay track), which achieves an accuracy of 86.55%, which is among the best performing systems of this shared task.
Friends and Enemies of Clinton and Trump: Using Context for Detecting Stance in Political Tweets
A novel approach for detecting stance in Twitter that defines a set of features in order to consider the context surrounding a target of interest with the final aim of training a model for predicting the stance towards the mentioned targets.
Semi-supervised Stance-Topic Model for Stance Classification on Social Media
A semi-supervised topic model, the Semi-Supervised Stance Topic Model (SSTM), that models the stances and topics of posts on social media and incorporates structural information about the posts (gender, location, and time information) to aggregate posts and alleviate their context sparsity.
Twitter Stance Detection — A Subjectivity and Sentiment Polarity Inspired Two-Phase Approach
This paper addresses the problem of detecting the stance of given tweets, with respect to given topics, from user-generated text (tweets), using the SemEval 2016 stance detection task dataset and develops a two-phase feature-driven model.
Stance Detection Based on Ensembles of Classifiers
The method of constructing ensembles proposed in this paper, which is based on a cross-validation procedure, makes it possible to optimize the parameters of the base classifiers, evaluate the effectiveness of each combination of classifiers included in the set, and select the optimal combination.


SemEval-2016 Task 6: Detecting Stance in Tweets
A shared task on detecting stance from tweets: given a tweet and a target entity (person, organization, etc.), automatic natural language systems must determine whether the tweeter is in favor of the given target, against the given target, or whether neither inference is likely.
Stance Classification of Ideological Debates: Data, Models, Features, and Constraints
How the performance of a learning-based stance classification system varies with the amount and quality of the training data, the complexity of the underlying model, the richness of the feature set, as well as the application of extra-linguistic constraints is examined.
Recognizing Stances in Online Debates
This paper presents an unsupervised opinion analysis method for debate-side classification, i.e., recognizing which stance a person is taking in an online debate, and shows that this method is substantially better than challenging baseline methods.
Cats Rule and Dogs Drool!: Classifying Stance in Online Debate
The results suggest that methods that take into account the dialogic context of such posts might be fruitful, and demonstrate that the number of subjective expressions varies across debates, a fact correlated with the performance of systems sensitive to sentiment-bearing terms.
Recognizing Stances in Ideological On-Line Debates
This work constructs an arguing lexicon automatically from a manually annotated corpus and builds supervised systems employing sentiment and arguing opinions and their targets as features, which perform substantially better than a distribution-based baseline.
Feature Selection for Highly Skewed Sentiment Analysis Tasks
The finding shows that feature selection is capable of improving the classification accuracy only in balanced or slightly skewed situations, and that TF IDF2 can help in identifying the minority class even in highly imbalanced cases.
GloVe: Global Vectors for Word Representation
A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Natural Language Processing (Almost) from Scratch
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity
Scikit-learn: Machine Learning in Python
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
This paper explores the use of Support Vector Machines (SVMs) for learning text classifiers from examples. It analyzes the particular properties of learning with text data and identifies why SVMs are