GutenTag: an NLP-driven Tool for Digital Humanities Research in the Project Gutenberg Corpus

@inproceedings{Brooke2015GutenTagAN,
  title={GutenTag: an NLP-driven Tool for Digital Humanities Research in the Project Gutenberg Corpus},
  author={Julian Brooke and Adam Hammond and Graeme Hirst},
  booktitle={CLfL@NAACL-HLT},
  year={2015}
}
This paper introduces a software tool, GutenTag, which is aimed at giving literary researchers direct access to NLP techniques for the analysis of texts in the Project Gutenberg corpus. We discuss several facets of the tool, including the handling of formatting and structure, the use and expansion of metadata which is used to identify relevant subcorpora of interest, and a general tagging framework which is intended to cover a wide variety of future NLP modules. Our hope that the shared ground…
