- York: Penguin
library catalogues have not commonly been used to record historical information about provenance, especially date of acquisition, which would be of particular value for the retrospective construction of content sets. Mining also demands machine-readable text, which is problematic with certain typefaces for printed material and near to impossible with manuscript. Mining of tabular data such as that included in the published reports of the London Medical Officers of Health, recently digitised by the Wellcome Library,11 is only possible because the tables themselves have been separately re-keyed and presented in appropriate formats. For historians interested in large-scale analysis of images, there is also the need to separate illustrations from text, while retaining some sense of the original context of the image. None of these are insurmountable obstacles. Librarians have traditionally managed data about the items they hold as adeptly as they have cared for the physical objects: the transition to digital content sets and to the application of content mining simply requires that these skills be applied a little differently, and without preciousness about the correspondence between physical holdings and virtual repository. Building a digital library for the history of medicine may be hard, but then again being a librarian has never been easy either!