We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes…
The language used in tweets from 1,300 different US counties was found to be predictive of the subjective well-being of people living in those counties as measured by representative surveys. Topics, sets of co-occurring words derived from the tweets using LDA, improved accuracy in predicting life satisfaction over and above standard demographic and…
Demographic lexica have potential for widespread use in social science, economic, and business applications. We derive predictive lexica (words and weights) for age and gender using regression and classification models from word usage in Facebook, blog, and Twitter data with associated demographic labels. The lexica, made publicly available, achieved…
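As an illustration of how a lexicon of words and weights might be applied to a text, here is a minimal Python sketch. It assumes a common scoring scheme (an intercept plus the sum of each word's weight times its relative frequency); the function name `lexicon_score`, the whitespace tokenizer, and the toy words, weights, and intercept are all hypothetical and are not taken from the published lexica.

```python
from collections import Counter


def lexicon_score(text, lexicon, intercept=0.0):
    """Score a text with a weighted lexicon (hypothetical helper).

    Assumed scheme: intercept + sum(weight * relative frequency of word).
    """
    # Naive lowercase whitespace tokenization; a real pipeline would
    # likely use a tokenizer designed for social media text.
    tokens = text.lower().split()
    if not tokens:
        # No words to score, so only the intercept contributes.
        return intercept
    counts = Counter(tokens)
    total = len(tokens)
    score = intercept
    for word, weight in lexicon.items():
        if word in counts:
            score += weight * counts[word] / total
    return score


# Toy lexicon with made-up weights, for illustration only.
toy_age_lexicon = {"homework": -2.0, "retirement": 3.0}
estimate = lexicon_score("homework homework tonight", toy_age_lexicon, intercept=30.0)
print(estimate)
```

Using relative rather than raw frequencies keeps scores comparable across texts of different lengths, which is one plausible reason to normalize in this way.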
Although social media are widely studied, computational linguistics typically focuses on prediction tasks: sentiment analysis, authorship attribution, personality prediction, ... Language analysis in social media can also be used to gain psychological insight. 74,941 volunteers shared their gender and age, and took a personality questionnaire. 14.3m…
Depression is typically diagnosed as being present or absent. However, depression severity is believed to be continuously distributed rather than dichotomous. Severity may vary for a given patient daily and seasonally as a function of many variables ranging from life events to environmental factors. Repeated population-scale assessment of depression through…
This article is a system description and report on the submission of the World Well-Being Project from the University of Pennsylvania in the 'CLPsych 2015' shared task. The goal of the shared task was to automatically identify Twitter users who self-reported having one of two mental illnesses: post-traumatic stress disorder (PTSD) and depression. Our…
Mental illnesses, such as depression and post-traumatic stress disorder (PTSD), are highly underdiagnosed globally. Populations sharing similar demographics and personality traits are known to be more at risk than others. In this study, we characterise the language use of users disclosing their mental illness on Twitter. Language-derived personality and…
We present the task of predicting individual well-being, as measured by a life satisfaction scale, through the language people use on social media. Well-being, which encompasses much more than emotion and mood, is linked with good mental and physical health. The ability to quickly and accurately assess it can supplement multi-million dollar national surveys…
Language use is a psychologically rich, stable individual difference with well-established correlations to personality. We describe a method for assessing personality using an open-vocabulary analysis of language from social media. We compiled the written language from 66,732 Facebook users and their questionnaire-based self-reported Big Five personality…
The merging of knowledge from genomics, cellular signal transduction, and molecular evolution is producing new paradigms of cancer analysis. Protein kinases have long been understood to initiate and promote malignant cell growth, and targeting kinases to fight cancer has been a major strategy within the pharmaceutical industry for over two decades. Despite…