Svitlana Volkova

Automatically inferring user demographics from social media posts is useful for both social science research and a range of downstream applications in marketing and politics. We present the first extensive study where user behaviour on Twitter is used to build a predictive model of income. We apply non-linear methods for regression, i.e. Gaussian Processes, …
Existing models for social media personal analytics assume access to thousands of messages per user, even though most users author content only sporadically over time. Given this sparsity, we: (i) leverage content from the local neighborhood of a user; (ii) evaluate batch models as a function of size and the amount of messages in various types of …
Different demographics, e.g., gender or age, can demonstrate substantial variation in their language use, particularly in informal contexts such as social media. In this paper we focus on learning gender differences in the use of subjective language in English, Spanish, and Russian Twitter data, and explore cross-cultural differences in emoticon and hashtag …
We demonstrate an approach to predict latent personal attributes including user demographics, online personality, emotions and sentiments from texts published on Twitter. We rely on machine learning and natural language processing techniques to learn models from user communications. We first examine individual tweets to detect emotions and opinions …
Latent author attribute prediction in social media provides a novel set of conditions for the construction of supervised classification models. With individual authors as training and test instances, their associated content (“features”) is made available incrementally over time, as they converse over discussion forums. We propose various approaches to …
Monitoring epidemic crises caused by the rapid spread of infectious animal diseases can be facilitated by the plethora of information about disease-related events that is available online. Therefore, the ability to use this information to perform domain-specific entity recognition and event-related sentence classification, which in turn can support time and …
We examine communications in a social network to study user emotional contrast – the propensity of users to express different emotions than those expressed by their neighbors. Our analysis is based on a large Twitter dataset, consisting of the tweets of 123,513 users from the USA and Canada. Focusing on Ekman’s basic emotions, we analyze differences between …
The paper considers theoretical problems and experimental results on recognizing anesthesia stages from electroencephalograms (EEGs) by methods based on analyzing the parameters of approximated entropy. It is shown that a discrete sequence of point entropy estimates can be efficiently used for describing regular and chaotic components of the EEG signal …
Social media services such as Twitter and Facebook are virtual environments where people express their thoughts, emotions, and opinions and where they reveal themselves to their peers. We analyze a sample of 123,000 Twitter users and 25 million of their tweets to investigate the relation between the opinions and emotions that users express and their …
We study subjective language in social media and create Twitter-specific lexicons via bootstrapping sentiment-bearing terms from multilingual Twitter streams. Starting with a domain-independent, high-precision sentiment lexicon and a large pool of unlabeled data, we bootstrap Twitter-specific sentiment lexicons, using a small amount of labeled data to guide …