Theory In, Theory Out: The Uses of Social Theory in Machine Learning for Social Science

Jason Radford and Kenneth Joseph
Frontiers in Big Data
Research at the intersection of machine learning and the social sciences has provided critical new insights into social behavior. At the same time, a variety of issues have been identified with the machine learning models used to analyze social data. These issues range from technical problems with the data used and features constructed, to problematic modeling assumptions, to limited interpretability, to the models' contributions to bias and inequality. Computational researchers have sought out… 

AI and social theory
It is argued that if the gaps identified here are addressed by further research, there is no reason why, in the future, the most advanced programme in social theory should not be led by AI-driven cumulative advances.
Measuring algorithmically infused societies.
It is argued that computational social scientists should rethink what aspects of algorithmically infused societies should be measured, how they should be measured, and the consequences of doing so.
Composite Measures for Assessing Multidimensional Social Exclusion in Later Life: Conceptual and Methodological Challenges
A range of existing and novel approaches to constructing a composite measure of multidimensional social exclusion in later life are compared and their performances are assessed using variables that are causally related to social exclusion.
A Computational Social Science Approach to Understanding Predictors of Chafee Service Receipt
A forensic social science analysis of the National Youth in Transition Database is conducted to identify three major factors (youth age, youth time in care, and the state in which a youth is in care) that are most heavily associated with service receipt.
Putting AI ethics to work: are the tools fit for purpose?
These practical frameworks are assessed through the lens of known best practices for impact assessment and audit of technology, and gaps are identified in current AI ethics tools for auditing and risk assessment that should be considered going forward.
An Agent-based Model to Evaluate Interventions on Online Dating Platforms to Decrease Racial Homogamy
The present work shows the value of using an ABM approach to help understand the potential effects and side effects of different interventions that a platform could take, and shows that many previously hypothesized interventions online dating platforms could take to increase the number of interracial relationships from their website have limited effects.
Predicting Psychological Distress During the COVID-19 Pandemic: Do Socioeconomic Factors Matter?
Machine learning models consisting of demographic, socioeconomic, behavioural and epidemiological features can be used for fast ‘first-hand’ screening to diagnose mental health problems in Norwegians.
Trustworthy AI
The tutorial on “Trustworthy AI” is proposed to address six critical issues in enhancing user and public trust in AI systems, namely: bias and fairness, explainability, robust mitigation of adversarial attacks, improved privacy and security in model building, and being decent.
Subgroup Invariant Perturbation for Unbiased Pre-Trained Model Prediction
This work proposes a bias estimation metric, termed Precise Subgroup Equivalence (PSE), to jointly measure the bias in model prediction and the overall model performance, together with a novel bias mitigation algorithm that is inspired by adversarial perturbation and uses the PSE metric.


Exploring Patterns of Identity Usage in Tweets: A New Problem, Solution and Case Study
A comprehensive feature set is developed that leverages several avenues of recent NLP work on Twitter and is used to train a supervised classifier that outperforms a surprisingly strong rule-based baseline by 33%.
Prediction and explanation in social systems
It is argued that the increasingly computational nature of social science is beginning to reverse this traditional bias against prediction; however, it has also highlighted three important issues that require resolution, which will lead to better, more replicable, and more useful social science.
Fairness and Abstraction in Sociotechnical Systems
This paper outlines this mismatch with five "traps" that fair-ML work can fall into even as it attempts to be more context-aware in comparison to traditional data science and suggests ways in which technical designers can mitigate the traps through a refocusing of design in terms of process rather than solutions.
Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls
Methodological and conceptual challenges for this emergent field of large-scale databases of human activity in social media are considered, with special attention to the validity and representativeness of social media big data analyses.
Garbage in, garbage out?: do machine learning application papers in social computing report where human-labeled training data comes from?
This paper investigates to what extent a sample of machine learning application papers in social computing give specific details about whether such best practices were followed, and finds a wide divergence in whether such practices were followed and documented.
In Search of Coherence and Consensus: Measuring the Interpretability of Statistical Topics
This work studies measures of interpretability and proposes to measure topic interpretability from two perspectives, topic coherence and topic consensus, where topic consensus measures how well the results of a crowdsourcing approach match the given categories of topics.
Measurement and Fairness
It is argued that many of the harms discussed in the literature on fairness in computational systems are direct results of such mismatches, and it is shown how some of these harms could have been anticipated and mitigated if viewed through the lens of measurement modeling.
Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries
A framework for identifying a broad range of menaces in the research and practices around social data is presented, including biases and inaccuracies at the source of the data as well as those introduced during processing.
Network Studies of Social Influence
Network analysts interested in social influence examine the social foundations for influence—the social relations that provide a basis for the alteration of an attitude or behavior by one network
Can an Algorithm be Agonistic? Ten Scenes from Life in Calculated Publics
This paper explores how political theory may help us map algorithmic logics against different visions of the political. Drawing on Chantal Mouffe’s theories of agonistic pluralism, this paper depicts