Competitive perspective identification via topic-based refinement for online documents
Online debate sites are a large source of informal, opinion-sharing dialogue on current socio-political issues. Inferring users’ stance (PRO or CON) towards discussion topics in domains such as politics or news is an important problem, and is useful to researchers, government organizations, and companies. Predicting users’ stance supports the identification of social and political groups, the building of better recommender systems, and the personalization of information according to users’ ideological beliefs. In this paper, we develop a novel collective classification approach to stance classification that makes use of both structural and linguistic features and collectively labels the stance of posts across a network of users’ posts. We identify both linguistic features of the posts and features that capture the underlying relationships between posts and users. We use probabilistic soft logic (PSL) (Bach et al., 2013) to model post stance, leveraging both these local linguistic features and the observed network structure of the posts to reason jointly over the dataset. We evaluate our approach on 4FORUMS (Walker et al., 2012b), a collection of discussions from an online debate site on issues ranging from gun control to gay marriage. We show that our collective classification model easily incorporates rich, relational information and outperforms a local model that uses only linguistic information.
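To make the idea of collective labeling concrete, the following is a minimal sketch of how soft-logic rules can combine local linguistic evidence with reply structure. It is an illustration under stated assumptions, not the model evaluated in the paper: the predicate names (`localPro`, `agree`, `disagree`), the rule weights, and the toy scores are all hypothetical, and a coarse grid search stands in for the convex inference that PSL actually performs. The only element taken from PSL itself is the Lukasiewicz relaxation, in which a rule `body -> head` incurs a penalty of `max(0, body - head)`.

```python
from itertools import product


def implication_distance(body, head):
    """Lukasiewicz distance to satisfaction of the soft rule body -> head."""
    return max(0.0, body - head)


def total_loss(pro, local_pro, agree, disagree, w_local=1.0, w_rel=2.0):
    """Weighted sum of rule violations for a soft assignment pro[post] in [0, 1].

    Observed links have truth value 1, so a body such as agree(A, B) & pro(A)
    reduces to pro(A) under the Lukasiewicz conjunction. Weights are hypothetical.
    """
    loss = 0.0
    for post, score in local_pro.items():
        # localPro(P) -> pro(P)   and   !localPro(P) -> !pro(P)
        loss += w_local * implication_distance(score, pro[post])
        loss += w_local * implication_distance(1.0 - score, 1.0 - pro[post])
    for a, b in agree:
        # agree(A, B) & pro(A) -> pro(B), applied in both directions
        loss += w_rel * implication_distance(pro[a], pro[b])
        loss += w_rel * implication_distance(pro[b], pro[a])
    for a, b in disagree:
        # disagree(A, B) & pro(A) -> !pro(B)   and   disagree(A, B) & !pro(A) -> pro(B)
        loss += w_rel * implication_distance(pro[a], 1.0 - pro[b])
        loss += w_rel * implication_distance(1.0 - pro[a], pro[b])
    return loss


def infer(local_pro, agree, disagree, steps=10):
    """Exhaustive grid search over soft truth values (toy stand-in for PSL inference)."""
    posts = sorted(local_pro)
    grid = [i / steps for i in range(steps + 1)]
    best = min(
        product(grid, repeat=len(posts)),
        key=lambda values: total_loss(dict(zip(posts, values)), local_pro, agree, disagree),
    )
    return dict(zip(posts, best))


def to_label(score):
    return "PRO" if score >= 0.5 else "CON"


# Hypothetical local classifier scores for three posts and two observed links.
local_pro = {"p1": 0.9, "p2": 0.4, "p3": 0.2}
agree = [("p1", "p2")]
disagree = [("p2", "p3")]

soft = infer(local_pro, agree, disagree)
local_labels = {post: to_label(score) for post, score in local_pro.items()}
collective_labels = {post: to_label(score) for post, score in soft.items()}
print(local_labels)
print(collective_labels)
```

In this toy network the local scores alone label `p2` as CON, whereas the joint assignment labels it PRO, because `p2` agrees with a confidently PRO post and disagrees with a confidently CON one. The exhaustive search is only feasible for a handful of posts; it is used here to keep the sketch short and deterministic.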