Corpus ID: 220845859

Between Subjectivity and Imposition: Power Dynamics in Data Annotation for Computer Vision

@article{Miceli2020BetweenSA,
  title={Between Subjectivity and Imposition: Power Dynamics in Data Annotation for Computer Vision},
  author={Milagros Miceli and M. Schuessler and Tianling Yang},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.14886}
}
The interpretation of data is fundamental to machine learning. This paper investigates practices of image data annotation as performed in industrial contexts. We define data annotation as a sense-making practice, where annotators assign meaning to data through the use of labels. Previous human-centered investigations have largely focused on annotators' subjectivity as a major cause of biased labels. We propose a wider view on this issue: guided by constructivist grounded theory, we conducted…
Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power?
TLDR
This commentary proposes moving the research focus beyond bias-oriented framings by adopting a power-aware perspective to “study up” ML datasets by accounting for historical inequities, labor conditions, and epistemological standpoints inscribed in data.
A Survey on Bias in Visual Datasets
TLDR
There is no such thing as a bias-free dataset, so scientists and practitioners must become aware of the biases in their datasets and make them explicit, and a checklist that can be used to spot different types of bias during visual dataset collection is proposed.
Abusive Language Detection in Heterogeneous Contexts: Dataset Collection and the Role of Supervised Attention
TLDR
This work provides an annotated dataset of abusive language in over 11,000 comments from YouTube and proposes an algorithm that uses a supervised attention mechanism to detect and categorize abusive content using multi-task learning.
Computer Vision and Conflicting Values: Describing People with Automated Alt Text
TLDR
This paper analyzes the policies that Facebook has adopted with respect to identity categories, such as race, gender, and age, and the company's decisions about whether to present these terms in alt text, and describes an alternative (and manual) approach practiced in the museum community, focusing on how museums determine what to include in alt text descriptions of cultural artifacts.
Documenting Computer Vision Datasets: An Invitation to Reflexive Data Practices
TLDR
This paper identifies four key issues that hinder the documentation of image datasets and the effective retrieval of production contexts and proposes reflexivity, understood as a collective consideration of social and intellectual factors that lead to praxis, as a necessary precondition for documentation.
Lifting the curtain: Strategic visibility of human labour in AI-as-a-Service
Artificial Intelligence-as-a-Service (AIaaS) empowers individuals and organisations to access AI on-demand, in either tailored or ‘off-the-shelf’ forms. However, institutional separation between…
PASS: An ImageNet replacement for self-supervised pretraining without humans
TLDR
This work proposes an unlabelled dataset PASS: Pictures without humAns for Self-Supervision, which shows that model pretraining is often possible while using safer data, and provides the basis for a more robust evaluation of pretraining methods.
The Who in Explainable AI: How AI Background Shapes Perceptions of AI Explanations
TLDR
A mixed-methods study of how two different groups of whos (people with and without a background in AI) perceive different types of AI explanations, finding that both groups had unwarranted faith in numbers, to different extents and for different reasons.
Tinkering: A Way Towards Designing Transparent Algorithmic User Interfaces
TLDR
The proposed approach of combining tinkering with transparent UIs serves two potential purposes: first, the exploratory nature of tinkering has the ability to make the algorithmic aspects transparent without hurting user experience (UX), while providing flexibility and sufficient control in the personalized interactive experience.
Wisdom for the Crowd: Discoursive Power in Annotation Instructions for Computer Vision
TLDR
The preliminary findings indicate that annotation instructions reflect worldviews imposed on workers and, through their labor, on datasets; that for-profit goals drive task instructions; and that managers and algorithms make sure annotations are done according to requesters' commands.

References

SHOWING 1-10 OF 98 REFERENCES
Data Feminism. The
  • 2020
Trust in Data Science: Collaboration, Translation, and Accountability in Corporate Data Science Projects
  • Proc. ACM Hum.-Comput. Interact. 2, CSCW (Nov. 2018),
  • 2018
Data Vision: Learning to See Through Algorithmic Abstraction
TLDR
This paper examines how the often-divergent demands of mechanization and discretion manifest in data analytic learning environments and shows that effective data vision requires would-be analysts to straddle the competing demands of formal abstraction and empirical contingency.
Sorting Things Out: Classification and Its Consequences
  • 1999
Language and Symbolic Power (new ed.)
  • 1992
Outline of a Theory of Practice
  • 1977
Garbage in, garbage out?: do machine learning application papers in social computing report where human-labeled training data comes from?
TLDR
This paper investigates to what extent a sample of machine learning application papers in social computing give specific details about whether such best practices were followed, and finds a wide divergence in whether such practices were followed and documented.
How We've Taught Algorithms to See Identity: Constructing Race and Gender in Image Databases for Facial Analysis
TLDR
It is found that the majority of image databases rarely contain underlying source material for how race and gender identities are defined and annotated, and that the lack of critical engagement with this nature renders databases opaque and less trustworthy.
How Data Science Workers Work with Data: Discovery, Capture, Curation, Design, Creation
TLDR
This paper, building on the work of other CSCW and HCI researchers in describing the ways that scientists, scholars, engineers, and others work with their data, analyzes interviews with 21 data science professionals and sets out five approaches to data along a dimension of interventions.
The Metric Society: On the Quantification of the Social / Steffen
  • Mau; Translated by Sharon Howe. Polity,
  • 2019