Disentangling Topic Models: A Cross-cultural Analysis of Personal Values through Words

Steven R. Wilson, Rada Mihalcea, Ryan L. Boyd, James W. Pennebaker
We present a methodology based on topic modeling that can be used to identify and quantify sociolinguistic differences between groups of people, and describe a regression method that can disentangle the influences of different attributes of the people in the group (e.g., culture, gender, age). As an example, we explore the concept of personal values, and present a cross-cultural analysis of value-behavior relationships spanning writers from the United States and India. 
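The idea of using regression to separate the influence of several attributes can be illustrated with a minimal sketch. This is not the authors' implementation: the per-writer topic score, the binary culture and gender indicators, the synthetic data, and every function name below are assumptions made for illustration. The sketch fits an ordinary least squares model with an intercept and compares its gender coefficient with the raw difference in group means.

```python
# Sketch only: synthetic data and hypothetical names, not the paper's code.

def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot to keep the elimination stable.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(features, targets):
    """Return [intercept, coef_1, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def mean_gap(features, targets, index):
    """Raw difference in mean target between attribute == 1 and attribute == 0."""
    ones = [t for f, t in zip(features, targets) if f[index] == 1]
    zeros = [t for f, t in zip(features, targets) if f[index] == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)


# Each writer is (culture, gender), both coded 0/1. The two attributes are
# deliberately correlated so that the raw group comparison is confounded.
writers = [(0, 0), (0, 0), (0, 1), (1, 0), (1, 1), (1, 1)]

# Assumed generating process for one topic's usage score (noise-free for clarity).
scores = [0.2 + 0.5 * culture + 0.1 * gender for culture, gender in writers]

intercept, culture_effect, gender_effect = fit_ols(writers, scores)
naive_gender_gap = mean_gap(writers, scores, index=1)
```

On this synthetic data the raw gender gap is about 0.27, because gender is correlated with culture, while the regression recovers the assumed gender effect of 0.1 and culture effect of 0.5.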


Global Reactions to the Cambridge Analytica Scandal: A Cross-Language Social Media Study
A cross-language study of the Cambridge Analytica scandal to compare how people speaking different languages react to data privacy breaches reveals a similar emphasis on Zuckerberg's hearing in the US Congress and the scandal’s impact on political issues.
Global Reactions to the Cambridge Analytica Scandal: An Inter-Language Social Media Study
An inter-language study of the Cambridge Analytica scandal reveals a similar emphasis on Zuckerberg's hearing in the US Congress and the scandal’s impact on political issues, and shows that while English speakers tend to attribute responsibilities to companies, Spanish speakers are more likely to connect them to people.
Understanding the Psycho-Sociological Facets of Homophily in Social Network Communities
Empirical results based on the psychosociological behavior show that friends networks exhibit homophily, whereas relatives and colleagues networks do not exhibit such homophilic behavior, and it is shown that such empirical evidence can be used as features for the tasks of community detection and link prediction.
#sendeanlat (#tellyourstory): Text Analyses of Tweets About Sexual Assault Experiences
On 11 February 2015, a 20-year-old university student, Ozgecan Aslan, was violently murdered in an attempted rape in Mersin, southern Turkey. This event led to a mass Twitter protest in the country.
You’re Only Jung Once: Building Generalized Motivational Systems Theories Using Contemporary Research on Language
In their target article for this issue of Psychological Inquiry, Becker and Neuberg have provided a thoughtful reflection on the intersection of motivational subsystems as well as their elaborate
From Text to Thought: How Analyzing Language Can Advance Psychological Science
It is proposed that language offers a unique window into psychology and that two forms of language analysis (natural-language processing and comparative linguistics) are contributing to how the authors understand topics as diverse as emotion, creativity, and religion and overcoming obstacles related to statistical power and culturally diverse samples.
Challenges and Strategies in Cross-Cultural NLP
Various efforts in the Natural Language Processing (NLP) community have been made to accommodate linguistic diversity and serve speakers of many different languages. However, it is important to
Can Social Ontological Knowledge Representations be Measured Using Machine Learning?
Personal Social Ontology (PSO), it is proposed, is how an individual perceives the ontological properties of terms, and the use of principal social perceptions is put forward as a viable method to feature engineer such texts.
Butter Lyrics Over Hominy Grit: Comparing Audio and Psychology-Based Text Features in MIR Tasks
An initial assessment of the usefulness of features drawn from lyrics for various fields, such as MIR and Music Psychology, by assessing the performance of lyric-based text features on 3 MIR tasks, in comparison to audio features.
“Judge me by my size (noun), do you?” YodaLib: A Demographic-Aware Humor Generation Framework
This work uses the BERT platform to predict location-biased word fillings in incomplete sentences, and fine-tune BERT to classify location-specific humor in a sentence to produce YodaLib, a fully-automated Mad Libs style humor generation framework.


Values in Words: Using Language to Evaluate and Understand Personal Values
It is suggested that self-report questionnaires for abstract and complex phenomena, such as values, are inadequate for painting an accurate picture of individual mental life, and that free-response language data and language modeling show greater promise for understanding both the structure and content of concepts such as values.
Dimensions of Self-Expression in Facebook Status Updates
Beyond simply indicating topicality of posts, this study provides insight into how status updates are used for self-expression, and the generation of theoretical frameworks from wholly empirical data (such as naturalistic Internet speech) using the MEM.
Validity problems comparing values across cultures and possible solutions.
The authors argue that commonly used ranking and rating methods of value surveys may have low validity in cross-cultural value comparisons because participants' reports about values can be affected
Cultural psychology.
S. Heine, M. Ruby. Psychology. Wiley Interdisciplinary Reviews: Cognitive Science, 2010.
A number of cultural differences in how people perceive the self are summarized, along with the behavioral consequences that follow from these differences in the domains of internal and external attribution styles, motivations for self-enhancement, approach/avoidance, primary and secondary control, and motivations for distinctiveness and conformity.
A Rose by Any Name? The Values Construct
Definitional inconsistency has been epidemic in values theory and research. An abbreviated review of values-related theory and research is provided, and 5 aspects of the values construct that may
On the Place of Culture in Psychological Science
Based on a positivist-empiricist mode of inquiry, mainstream psychology has been vigorously engaged in characterizing human lives in terms of mechanistic and individualistic constructions,
Revealing Dimensions of Thinking in Open-Ended Self-Descriptions: An Automated Meaning Extraction Method for Natural Language.
Comparing Twitter and Traditional Media Using Topic Models
This paper empirically compares the content of Twitter with that of a traditional news medium, the New York Times, using unsupervised topic modeling, and reports findings that are useful for downstream IR or DM applications.