Automatically Segmenting Oral History Transcripts


Dividing oral histories into topically coherent segments can make them more accessible online. People regularly make judgments about where coherent segments can be extracted from oral histories. But making these judgments can be taxing, so automated assistance is potentially attractive to speed the task of extracting segments from open-ended interviews. When different people are asked to extract coherent segments from the same oral histories, they often do not agree about precisely where such segments begin and end. This low agreement makes the evaluation of algorithmic segmenters challenging, but there is reason to believe that for segmenting oral history transcripts, some approaches are more promising than others. The BayesSeg algorithm performs slightly better than TextTiling, while TextTiling does not perform significantly better than a uniform segmentation. BayesSeg might be used to suggest boundaries to someone segmenting oral histories, but this segmentation task needs to be better defined.
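
To make the uniform-segmentation baseline in the abstract concrete, here is a minimal Python sketch. It places boundaries at evenly spaced positions and scores them against a reference segmentation with WindowDiff. The choice of WindowDiff is an assumption made for illustration, since the abstract does not say which evaluation metric the paper uses. The function names, the boundary encoding (a boundary b means a break immediately before unit b), and the toy data are likewise hypothetical.

```python
def uniform_boundaries(n_units, n_segments):
    """Split n_units (e.g. sentences) into n_segments near-equal parts.

    Returns a sorted list of boundary positions; a boundary b means
    "a new segment starts at unit b" (0-based, 0 < b < n_units).
    """
    if not 1 <= n_segments <= n_units:
        raise ValueError("need 1 <= n_segments <= n_units")
    return [(k * n_units) // n_segments for k in range(1, n_segments)]


def window_diff(reference, hypothesis, n_units, k):
    """WindowDiff: fraction of length-k windows in which the reference and
    the hypothesis contain a different number of boundaries (0.0 is best).
    """
    if not 0 < k < n_units:
        raise ValueError("need 0 < k < n_units")

    def count(bounds, i):
        # Boundaries that fall between unit i and unit i + k.
        return sum(1 for b in bounds if i < b <= i + k)

    windows = n_units - k
    errors = sum(
        1 for i in range(windows)
        if count(reference, i) != count(hypothesis, i)
    )
    return errors / windows


if __name__ == "__main__":
    # Toy example (invented data): a 10-sentence transcript in which an
    # annotator marked new segments starting at sentences 3 and 6.
    reference = [3, 6]
    baseline = uniform_boundaries(n_units=10, n_segments=4)
    print(baseline, window_diff(reference, baseline, n_units=10, k=2))
```

A baseline like this gives a floor for comparison: a segmenter that does not score clearly better than evenly spaced boundaries is adding little, which is the sense in which the abstract compares TextTiling to a uniform segmentation. Because annotators often disagree on boundaries, scores against any single reference should be read with caution.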

Cite this paper

@article{Shaw2015AutomaticallySO, title={Automatically Segmenting Oral History Transcripts}, author={Ryan Shaw}, journal={CoRR}, year={2015}, volume={abs/1509.08842} }