Historiographic Mapping of Knowledge Domains Literature

Abstract

To better understand the topic of this colloquium, we have created a series of databases related to knowledge domains [dynamic systems (small world/Milgram), information visualization (Tufte), co-citation (Small), bibliographic coupling (Kessler), and scientometrics (Scientometrics)]. I have used a software package called HistCite, which generates chronological maps of subject (topical) collections resulting from searches of the ISI Web of Science (WoS) or the ISI citation indexes (SCI, SSCI, and/or AHCI) on CD-ROM. When a marked list is created on WoS, an export file is created that contains all cited references for each source document captured. These bibliographic collections, saved as ASCII files, are processed by HistCite to generate chronological and other tables as well as historiographs that highlight the most-cited works inside and outside the collection. HistCite also includes a module for detecting and editing errors or variations in cited references, as well as a vocabulary analyzer that generates both ranked word lists and word pairs used in the collection. Ideally, the system will be used to help the searcher quickly identify the most significant work on a topic and trace its year-by-year historical development. In addition to the collections mentioned above, historiographs were created from collections of papers that cite the classic 1953 Watson-Crick paper identifying the helical structure of DNA. Both year-by-year and month-by-month displays of papers from 1953 to 1958 were necessary to highlight the publication activity of those years.
I was reluctant to accept Katy Borner’s invitation to give this keynote talk since I had never heard the term “Knowledge Domains” before. Furthermore, I am not an expert on the subject of visualization. Her misperception on that point was probably due to a paper I recently published in the special issue of the Journal of the American Society for Information Science and Technology on visualization.
The issue editor, Chaomei Chen of Drexel University, had roped me into that contribution since he had heard about my interest in mapping from colleagues Howard White and Kate McCain at Drexel. Over a period of several months, my staff worked with Katy to identify various literature subsets she perceived as being relevant to the knowledge domain literature. To facilitate that process, we used a software package still in development called HistCite. This system has evolved over the past several years and traces its roots to a 1964 project conducted by me and Irving Sher, who died several years ago, and sponsored by Harold Wooster of the U.S. Air Force. The report on that project, “The Uses of Citation Data in Writing the History of Science,” is available on my web page at www.eugenegarfield.org. We interested Wooster in the idea when we completed our NIH-sponsored work on the Genetics Citation Index (GCI) project. The GCI eventually led to publication of the 1961 volumes of the Science Citation Index in 1964. Sher and I had speculated on the possibility that the cited references in scholarly papers could be used to create topological maps of science. To test this theory, we used Isaac Asimov’s book The Genetic Code as a model. Asimov, a professor of biochemistry better known as a prolific science fiction writer, identified the 40 key scientific events in the development of DNA science from the time of Gregor Mendel until the 1961 Nobel work of Marshall Nirenberg at NIH. We used about 60 published papers mentioned by Asimov to create a mini citation index from the 1,000-odd references they cited. From these data we were able to draw the first citation-based historiograph, shown in Figure 1 (http://garfield.library.upenn.edu/papers/finaloverlay.pdf). Our interest in the graph-theoretical aspects of citation networks was further reflected in a 1967 thesis at Drexel University by Ralph Garner, an ISI employee at that time (http://www.garfield.library.upenn.edu/rgarner.pdf).
Each box in this historiograph represents a key event. The colored connecting lines indicate various levels or strengths of citation linkage. Two decades later, the DNA project data were used as a model in a paper by two social network researchers at the University of Pittsburgh, Norman P. Hummon and Patrick Doreian. Except for their work, the original idea was basically ignored until a few years ago, when my long-time colleague, geneticist Alexander I. Pudovkin, and I discussed the possibility of reviving it by writing a program that would create historiographs algorithmically. This led to the HistCite software described below. The process was first publicly discussed at a University of Pittsburgh conference and then at the ASIS&T Annual Meeting in November 2002. The ASIS&T paper includes, among others, a HistCite analysis for “gene flow,” an area in population genetics of interest to Pudovkin. From those initial trials, the software has evolved to its present form.
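
As a rough illustration of the kind of processing described above, the following Python sketch takes a small collection of source records with their cited references, counts how often each work is cited from within the collection, separates works cited inside the collection from those outside it, and lays the collection out year by year. It is not HistCite's actual implementation, which the text does not detail. The record layout, the identifiers, the toy data, and the function name `build_historiograph` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy collection (not real data): each record has an id,
# a publication year, and the list of references it cites.
records = [
    {"id": "A1953", "year": 1953, "cites": ["X1950", "Y1951"]},
    {"id": "B1954", "year": 1954, "cites": ["A1953", "X1950"]},
    {"id": "C1955", "year": 1955, "cites": ["A1953", "B1954", "X1950"]},
    {"id": "D1956", "year": 1956, "cites": ["A1953", "C1955"]},
]


def build_historiograph(records):
    """Return citation counts, citation links, and a year-by-year layout.

    Illustrative only: a real system would also have to reconcile
    variant spellings of the same cited reference before counting.
    """
    in_collection = {r["id"] for r in records}
    inside = Counter()   # citations to works that are in the collection
    outside = Counter()  # citations to works that are not in the collection
    edges = []           # (citing id, cited id) links within the collection

    for record in records:
        # Count each cited reference once per citing paper.
        for ref in sorted(set(record["cites"])):
            if ref in in_collection:
                inside[ref] += 1
                edges.append((record["id"], ref))
            else:
                outside[ref] += 1

    # Chronological layout: one row of paper ids per publication year.
    by_year = defaultdict(list)
    for record in sorted(records, key=lambda r: (r["year"], r["id"])):
        by_year[record["year"]].append(record["id"])

    return inside, outside, edges, dict(by_year)


inside, outside, edges, by_year = build_historiograph(records)

for year in sorted(by_year):
    row = ", ".join(f"{pid} (cited {inside[pid]}x)" for pid in by_year[year])
    print(year, row)
print("Most cited outside the collection:", outside.most_common(1))
```

Drawing the `edges` between the yearly rows would give a simple historiograph in the sense used here: nodes ordered by time, linked by citation.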

DOI: 10.1177/0165551504042802

Cite this paper

@article{Garfield2004HistoriographicMO, title={Historiographic Mapping of Knowledge Domains Literature}, author={Eugene Garfield}, journal={J. Information Science}, year={2004}, volume={30}, pages={119-145} }