Semi-Automated Coding for Qualitative Research: A User-Centered Inquiry and Initial Prototypes

@inproceedings{Marathe2018SemiAutomatedCF,
  title={Semi-Automated Coding for Qualitative Research: A User-Centered Inquiry and Initial Prototypes},
  author={Megh Marathe and Kentaro Toyama},
  booktitle={Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems},
  year={2018}
}
  • Megh Marathe, K. Toyama
  • Published 21 April 2018
  • Computer Science
  • Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems
Qualitative researchers perform an important and painstaking data annotation process known as coding. However, much of the process can be tedious and repetitive, becoming prohibitive for large datasets. Could coding be partially automated, and should it be? To answer this question, we interviewed researchers and observed them code interview transcripts. We found that across disciplines, researchers follow several coding practices well-suited to automation. Further, researchers desire automation… 

Cody: An Interactive Machine Learning System for Qualitative Coding
TLDR
Cody is the first coding system to allow users to define query-style code rules in combination with supervised ML and can extend manual annotations to unseen data to improve coding speed and quality.
Cody: An AI-Based System to Semi-Automate Coding for Qualitative Research
TLDR
Cody, an AI-based system that semi-automates coding through code rules and supervised ML, is introduced, finding that code rules provide structure and transparency and suggestions benefit coding quality rather than coding speed, increasing the intercoder reliability.
Developing and testing an automated qualitative assistant (AQUA) to support qualitative analysis
TLDR
An automated qualitative assistant (AQUA) is developed using a semiclassical approach, replacing Latent Semantic Indexing/Latent Dirichlet Allocation with a more transparent graph-theoretic topic extraction and clustering method, illustrating how primary care researchers may use AQUA to rapidly and accurately code large text datasets.
Accelerating Deductive Coding of Qualitative Data: An Experimental Study on the Applicability of Crowdsourcing
TLDR
An interactive coding system to support crowdsourced deductive coding of semi-structured qualitative data is presented and indicates that crowdsourced coding is an applicable strategy for accelerating a strenuous task.
Supporting Serendipity
TLDR
This study provides a deep investigation of task delegability in human-AI collaboration in the context of qualitative analysis, and offers directions for the design of AI assistance that honor serendipity, human agency, and ambiguity.
Putting Tools in Their Place: The Role of Time and Perspective in Human-AI Collaboration for Qualitative Analysis
TLDR
It is shown that the stage of qualitative analysis matters for how scholars believe AI can and should be used, and how designing for human-AI collaboration in qualitative analysis necessitates considering tradeoffs in scale, abstraction, and task delegation.
Reliability and Inter-rater Reliability in Qualitative Research
What does reliability mean for building a grounded theory? What about when writing an auto-ethnography? When is it appropriate to use measures like inter-rater reliability (IRR)? Reliability is a
Supporting Serendipity: Opportunities and Challenges for Human-AI Collaboration in Qualitative Analysis
Qualitative inductive methods are widely used in CSCW and HCI research for their ability to generatively discover deep and contextualized insights, but these inherently manual and
The Exploratory Labeling Assistant: Mixed-Initiative Label Curation with Large Document Collections
TLDR
This paper proposes an interactive visual data analysis method that integrates human-driven label ideation, specification and refinement with machine-driven recommendations and uses unsupervised machine learning methods that provide suggestions and data summaries.
From Detectables to Inspectables: Understanding Qualitative Analysis of Audiovisual Data
TLDR
This work investigated researchers’ transcription and annotation practice, their overall analysis workflow, and the prevalence of direct analysis of audiovisual recordings, finding that a key task was locating and analyzing inspectables, interesting segments in recordings.

References

Tools for Analyzing Qualitative Data: The History and Relevance of Qualitative Data Analysis Software
TLDR
This chapter provides an overview of tasks involved in analyzing qualitative data, with a focus on increasingly complex projects, and identifies the increasingly diverse array of expected features and functions in most of the current software programs.
Semi-Automatic Content Analysis of Qualitative Data
TLDR
This work presents a semi-automatic system that leverages natural language processing (NLP) and machine learning (ML) techniques for initial automatic coding; human coders then review and correct the resulting codes, which are subsequently used to train a higher-performing model for machine annotation.
Computer assessment of interview data using latent semantic analysis
TLDR
An instrument that uses LSA technology was developed to identify misconceptions and assess conceptual change in students’ thinking and its accuracy reached 90%.
Computer-Aided Qualitative Data Analysis: Theory, Methods and Practice
TLDR
An overview of Computer-Aided Methods in Qualitative Research and Theory Building and an Overview of Software are presented.
A Computational Study of Commonsense Science: An Exploration in the Automated Analysis of Clinical Interview Data
TLDR
It is attempted to show that it is possible to use techniques from computational linguistics to analyze data from commonsense science interviews in a manner that may provide convergent support for the work of human analysts.
Inquire: Large-scale Early Insight Discovery for Qualitative Research
TLDR
In Inquire, a tool designed to enable qualitative exploration of utterances in social media and large-scale texts, it is shown how queries become a part of the inductive process, enabling researchers to try multiple ideas while gaining intuition and discovering less-obvious insights.
Manual or electronic? The role of coding in qualitative data analysis
TLDR
The author looks at both methods of coding data in two rather different projects, in which the data were collected mainly by in-depth interviewing, and concludes that the choice depends on the size of the project, the funds and time available, and the inclination and expertise of the researcher.
Users' Experiences with Qualitative Data Analysis Software
TLDR
Users of qualitative data analysis software in most cases use the computer as an organizational, time-saving tool and take special care to maintain close relationships with both the data and the respondents.
Coder training: Theoretical training or practical socialization?
TLDR
Fragments of a transcript of a coder training session are presented that suggest that inter-coder reliability is improved not only by communicating the coding instructions to coders (theoretical training) but also by socializing coders into practical rules that are not part of the coding instructions and are not warranted by them.
The Coding Manual for Qualitative Researchers
TLDR
This chapter discusses writing Analytic Memos About Narrative and Visual Data and exercises for Coding and Qualitative Data Analytic Skill Development.