Identifying Meaningful Citations

Abstract

We introduce the novel task of identifying important citations in scholarly literature, i.e., citations that indicate that the cited work is used or extended in the new effort. We believe this task is a crucial component in algorithms that detect and follow research topics and in methods that measure the quality of publications. We model this task as a supervised classification problem at two levels of detail: a coarse one with two classes (important vs. non-important), and a more detailed one with four importance classes. We annotate a dataset of approximately 450 citations with this information and release it publicly. We propose a supervised classification approach that addresses this task with a battery of features that range from citation counts to where the citation appears in the body of the paper, and show that our approach achieves a precision of 65% at a recall of 90%.
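The reported figure is a precision measured at a fixed recall. As a minimal sketch of how such a number can be read off a classifier's ranked output, the Python below walks down a list of scored citations until the target recall is reached and reports the precision at that point. The function name `precision_at_recall`, the scores, and the labels are all hypothetical and chosen only for illustration; they are not the paper's data, features, or evaluation code, and the sketch assumes scores are distinct.

```python
def precision_at_recall(scores, labels, target_recall):
    """Return (threshold, precision, recall) at the highest score threshold
    whose recall reaches target_recall.

    labels are 1 (important) or 0 (non-important); scores are assumed distinct.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank citations from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for k, (score, label) in enumerate(ranked, start=1):
        true_pos += label
        recall = true_pos / total_pos
        if recall >= target_recall:
            # Everything ranked at or above this score is predicted important.
            return score, true_pos / k, recall
    raise ValueError("target_recall must be at most 1.0")


# Hypothetical toy data: 10 citations, 5 of them truly important.
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

threshold, precision, recall = precision_at_recall(toy_scores, toy_labels, 0.8)
print(f"threshold={threshold}, precision={precision:.3f}, recall={recall:.2f}")
```

On this toy input, reaching 80% recall requires accepting the six highest-scored citations, four of which are important. Lowering the threshold to gain recall generally costs precision, which is why the two numbers are reported together.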

Citation Velocity: 13

Averaging 13 citations per year over the last 2 years.

Cite this paper

@inproceedings{Valenzuela2015IdentifyingMC,
  title={Identifying Meaningful Citations},
  author={Marco Valenzuela and Vu Ha and Oren Etzioni},
  booktitle={AAAI Workshop: Scholarly Big Data},
  year={2015}
}