CiteSeer: an automatic citation indexing system
@inproceedings{Giles1998CiteSeerAA, title={CiteSeer: an automatic citation indexing system}, author={C. Lee Giles and Kurt D. Bollacker and Steve Lawrence}, booktitle={DL '98}, year={1998} }
We present CiteSeer: an autonomous citation indexing system which indexes academic literature in electronic format (e.g. Postscript files on the Web). CiteSeer understands how to parse citations, identify citations to the same paper in different formats, and identify the context of citations in the body of articles. CiteSeer provides most of the advantages of traditional (manually constructed) citation indexes (e.g. the ISI citation indexes), including: literature retrieval by following…
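As a rough illustration of what "identify citations to the same paper in different formats" can involve, the sketch below compares two citation strings by the overlap of their word tokens (Jaccard similarity). This is an assumed, simplified stand-in and not CiteSeer's actual matching method; the function names and the 0.5 threshold are hypothetical choices made for the example.

```python
import re


def normalize(citation):
    """Lowercase the citation and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", citation.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def same_paper(c1, c2, threshold=0.5):
    """Guess whether two citation strings refer to the same paper.

    The threshold is an illustrative assumption, not a tuned value.
    """
    return jaccard(normalize(c1), normalize(c2)) >= threshold


if __name__ == "__main__":
    a = ("Giles, C.L., Bollacker, K., Lawrence, S. (1998). CiteSeer: an "
         "automatic citation indexing system. DL '98.")
    b = ("C. Lee Giles, Kurt D. Bollacker, and Steve Lawrence. CiteSeer: An "
         "Automatic Citation Indexing System. In DL '98, 1998.")
    c = "Porter, M.F. (1980). An algorithm for suffix stripping. Program."
    print(same_paper(a, b), same_paper(a, c))
```

Token overlap ignores word order and author-name abbreviations, so a practical system would likely need field-aware parsing on top of a measure like this.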
870 Citations
Exploring Automatic Citation Classification
- Computer Science
- 2008
A new citation scheme that is easier to work with than most, a document acquisition and citation annotation tool that helps with the development of annotated citation corpora, and some experiments with automating citation classification are presented.
Autonomous citation matching
- Computer Science
- AGENTS '99
- 1999
This work presents machine learning techniques that identify variant forms of citations to the same paper, and presents a number of algorithms that perform best and are sufficiently accurate for unassisted use in an autonomous citation indexing system.
CAD: an algorithm for citation-anchors detection in research papers
- Computer Science
- Scientometrics
- 2018
The paper proposes an algorithm, CAD, for identifying citation-anchors and their in-text citation frequency based on different rules, and shows that CAD improved the F-score by 44% and 37% on the J.UCS and CiteSeer datasets, respectively, over the contemporary technique.
Extracting Citation Metadata from Online Publication Lists Using BLAST
- Computer Science
- PAKDD
- 2004
This work presents a new methodology based on a protein sequence alignment tool, and develops a template-generating system to transform known semi-structured citation strings into protein sequences, which are saved as templates in a database.
Lessons Learned: The Complexity of Accurate Identification of in-Text Citations
- Computer Science
- Int. Arab J. Inf. Technol.
- 2015
Accurate identification of in-text citations will help information retrieval systems, digital libraries, and citation indexes; the paper also highlights the problems (mathematical ambiguities, wrong allotments, commonality in content, and string variation) in identifying in-text citations from scientific documents.
An annotation scheme for citation function
- Computer Science
- 2019
Here, an annotation scheme with 12 categories is introduced, and an agreement study is presented on the interplay of the discourse structure of a scientific argument with formal citations.
Pattern Analysis of Citation-Anchors in Citing Documents for Accurate Identification of In-Text Citations
- Computer Science
- IEEE Access
- 2017
A taxonomy and a workable system are proposed; the system uses a set of heuristics built from a detailed study and is applied to an unseen, diversified data set taken from the Journal of Universal Computer Science and CiteSeer.
Rule based Autonomous Citation Mining with TIERL
- Computer Science
- J. Digit. Inf. Manag.
- 2010
A novel rule-based autonomous citation mining technique is proposed that is able to overcome limitations of current leading citation indexes such as ISI Web of Knowledge, CiteSeer and Google Scholar, and significantly enhances the correct discovery of citations.
Are Your Citations Clean? New Scenarios and Challenges in Maintaining Digital Libraries
- Computer Science
- 2006
In many scientific-publication digital libraries (DLs) such as CiteSeer, arXiv e-Print, DBLP, or Google Scholar, “citations” play an important role and it is important for DLs to keep citations of stored documents consistent and up-to-date.
Effects of Unpopular Citation Fields in Citation Matching Performance
- Computer Science
- 2011 International Conference on Information Science and Applications
- 2011
It is proposed that there is always a best combination of citation record fields that helps increase citation matching performance and is applicable regardless of which research framework one may adopt, such as Machine Learning methods or Information Retrieval algorithms.
References
AUTOMATIC INDEXING USING BIBLIOGRAPHIC CITATIONS
- Computer Science
- 1971
It is shown that the use of bibliographic citations in addition to the normal keyword‐type indicators produces improved retrieval performance, and that in some circumstances, citations are more effective for retrieval purposes than other more conventional terms and concepts.
Comparative citation rankings of authors in monographic and journal literature: a study of sociology
- Medicine
- J. Documentation
- 1997
The study examined the scholarly literature of sociology and found that the relative rankings of authors who were highly cited in the monographic literature did not change in the journal literature of the same period, suggesting that there may be two distinct populations of highly cited authors.
Evidence of complex citer motivations
- Psychology
- 1986
There were 20 scholars interviewed about their citation motives in recently published articles. Their 437 citations were scaled along 1 or more of the following 7 citer motives: currency, negative…
Cited Documents as Concept Symbols
- Computer Science
- 1978
An interpretation of citation practice in scientific literature is offered which regards citation of a document as an act of symbol usage around the footnote number, and a high degree of uniformity is revealed in the association of specific concepts with specific documents.
On-the-fly Hyperlink Creation for Page Images
- Computer Science
- DL
- 1995
Using the World-Wide Web, a system for creating hypertext links on the fly in a library composed of bitmapped images of paper documents and text derived from those images by optical-character recognition is described.
Term-Weighting Approaches in Automatic Text Retrieval
- Computer Science
- Inf. Process. Manag.
- 1988
An algorithm for suffix stripping
- Linguistics
- Program
- 1980
An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL and performs slightly better than a much more elaborate system with which it has been compared.
On the Specification of Term Values in Automatic Indexing
- Computer Science
- 1973
It is shown that the standard theories for the specification of term values (or weights) are not adequate, and new techniques are introduced for the assignment of weights to index terms, based on the characteristics of individual document collections.
Citation indexing: its theory and application in science
- Education, Art
- 1979
Citation indexing: its theory and application in science, technology, and humanities.
Data structures and algorithms for nearest neighbor search in general metric spaces
- Computer Science
- SODA '93
- 1993
The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems in general metric spaces.
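
To make the vantage point tree idea concrete, here is a minimal sketch under simplifying assumptions: each node takes the first remaining point as its vantage point, splits the rest at the median distance, and nearest-neighbor search prunes a subtree using the triangle inequality. The names `VPNode`, `build`, and `nearest` are hypothetical, and the variants described in the cited paper differ in how vantage points are chosen and what is stored per node.

```python
import math
import statistics


class VPNode:
    """One node: a vantage point, a split radius, and two subtrees."""

    def __init__(self, point, radius, inside, outside):
        self.point = point
        self.radius = radius
        self.inside = inside    # points with distance <= radius
        self.outside = outside  # points with distance > radius


def build(points, dist):
    """Recursively build a tree; the first point serves as vantage point."""
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return VPNode(vp, 0.0, None, None)
    dists = [dist(vp, p) for p in rest]
    mu = statistics.median(dists)
    inside = [p for p, d in zip(rest, dists) if d <= mu]
    outside = [p for p, d in zip(rest, dists) if d > mu]
    return VPNode(vp, mu, build(inside, dist), build(outside, dist))


def nearest(node, query, dist, best=None):
    """Return (distance, point) of the nearest stored point to query."""
    if node is None:
        return best
    d = dist(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Descend first into the side that contains the query.
    if d <= node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, query, dist, best)
    # By the triangle inequality, the other side can only hold a closer
    # point if the current best ball reaches across the split radius.
    if abs(d - node.radius) <= best[0]:
        best = nearest(second, query, dist, best)
    return best


if __name__ == "__main__":
    pts = [(0, 0), (1, 1), (5, 5), (2, 3), (9, 1)]
    tree = build(pts, math.dist)
    print(nearest(tree, (2, 2), math.dist))
```

The sketch only requires that `dist` be a metric, which is what allows the pruning step; Euclidean distance via `math.dist` is used here purely as an example.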