The Future of Coding: A Comparison of Hand-Coding and Three Types of Computer-Assisted Text Analysis Methods
@article{Nelson2018TheFO,
  title={The Future of Coding: A Comparison of Hand-Coding and Three Types of Computer-Assisted Text Analysis Methods},
  author={Laura K. Nelson and Derek Burk and Marcel L Knudsen and Leslie McCall},
  journal={Sociological Methods \& Research},
  year={2018},
  volume={50},
  pages={202--237}
}
Advances in computer science and computational linguistics have yielded new, and faster, computational approaches to structuring and analyzing textual data. These approaches perform well on tasks like information extraction, but their ability to identify complex, socially constructed, and unsettled theoretical concepts—a central goal of sociological content analysis—has not been tested. To fill this gap, we compare the results produced by three common computer-assisted approaches—dictionary…
66 Citations
Examining Sentiment in Complex Texts. A Comparison of Different Computational Approaches
- Computer Science
- Frontiers in Big Data
- 2022
A comparison of dictionary and scaling methods used in predicting the sentiment of German literature reviews to the “gold standard” of human-coded sentiments provides a practical guide for researchers to select an appropriate method and degree of pre-processing when working with complex texts.
All work and no play: A text analysis
- Computer Science
- International Journal of Market Research
- 2019
Some of the key contemporary themes in text analytics and the likely future role of this method within market research and insight are discussed, including Q’s text analysis component and Google Cloud Natural Language.
Qualitative Coding in the Computational Era: A Hybrid Approach to Improve Reliability and Reduce Effort for Coding Ethnographic Interviews
- Sociology
- Socius: Sociological Research for a Dynamic World
- 2021
Sociologists have argued that there is value in incorporating computational tools into qualitative research, including using machine learning to code qualitative data. Yet standard computational…
Text mining for social science - The state and the future of computational text analysis in sociology.
- Sociology, Computer Science
- Social Science Research
- 2022
Measuring and Visualizing Coders’ Reliability: New Approaches and Guidelines From Experimental Data
- Business
- Sociological Methods & Research
- 2020
This study investigates inter- and intracoder reliability, proposing a new approach based on social network analysis (SNA) and exponential random graph models (ERGM) that is compatible with current ERGM models.
Separating the wheat from the chaff: A topic and keyword-based procedure for identifying research-relevant text
- Computer Science
- 2021
Epistemological Considerations of Text Mining: Implications for Systematic Literature Review
- Computer Science
- Mathematics
- 2021
This article proposes to rethink the epistemological principles of text mining, by returning to the qualitative analysis of its meaning and structure, and presents alternatives, applicable to the process of constructing lexical matrices for the analysis of a complex textual corpus.
Generalized word shift graphs: a method for visualizing and explaining pairwise comparisons between texts
- Computer Science
- EPJ Data Sci.
- 2021
Generalized word shift graphs are introduced, visualizations which yield a meaningful and interpretable summary of how individual words contribute to the variation between two texts for any measure that can be formulated as a weighted average.
The Augmented Social Scientist: Using Sequential Transfer Learning to Annotate Millions of Texts with Human-Level Accuracy
- Computer Science
- Sociological Methods & Research
- 2022
It is shown via an experiment that an expert can train a precise, efficient automatic classifier in a very limited amount of time and that, under certain conditions, expert-trained models produce better annotations than humans themselves.
Linguistic, cultural, and narrative capital: computational and human readings of transfer admissions essays
- Education
- Journal of Computational Social Science
- 2022
Variation in college application materials related to social stratification is a contentious topic in social science and national discourse in the United States. This line of research has also…
References
Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts
- Sociology
- Political Analysis
- 2013
Politics and political conflict often occur in the written and spoken word. Scholars have long recognized this, but the massive costs of analyzing even moderately sized collections of texts have…
Computational Grounded Theory: A Methodological Framework
- Computer Science
- 2020
This article proposes a three-step methodological framework called computational grounded theory, which combines expert human knowledge and hermeneutic skills with the processing power and pattern…
A Method of Automated Nonparametric Content Analysis for Social Science
- Computer Science
- 2010
This work develops a method that gives approximately unbiased estimates of category proportions even when the optimal classifier performs poorly, and illustrates with diverse data sets, including the daily expressed opinions of thousands of people about the U.S. presidency.
Computer-Aided Content Analysis of Digitally Enabled Movements
- Computer Science
- 2013
With the emergence of the Arab Spring and the Occupy movements, interest in the study of movements that use the Internet and social networking sites has grown exponentially. However, our inability…
Automatic Extraction of Facts from Press Releases to Generate News Stories
- Computer Science
- ANLP
- 1992
JASPER is a fact extraction system recently developed and deployed by Carnegie Group for Reuters Ltd, which uses a template-driven approach, partial understanding techniques, and heuristic procedures to extract certain key pieces of information from a limited range of text.
Coder Reliability and Misclassification in the Human Coding of Party Manifestos
- Computer Science
- Political Analysis
- 2012
The findings indicate that misclassification is a serious and systemic problem with the current CMP data set and coding process, suggesting the CMP scheme should be significantly simplified to address reliability issues.
Information Extraction
- Computer Science
- Lecture Notes in Computer Science
- 2002
This paper discusses attempts to derive templates, knowledge structures, and lexicons directly from corpora, including the recent LE project ECRAN, which attempted to tune existing lexicons to new corpora.
Treating Words as Data with Error: Uncertainty in Text Statements of Policy Positions
- Computer Science
- 2009
This work characterizes processes by which CMP data are generated, and shows how to correct biased inferences, in recent prominently published work, derived from statistical analyses of error-contaminated CMP data.