Learning to Predict Citation-Based Impact Measures

  • Luca Weihs, Oren Etzioni
  • Published 1 June 2017
  • Computer Science
  • 2017 ACM/IEEE Joint Conference on Digital Libraries (JCDL)
Citations implicitly encode a community's judgment of a paper's importance and thus provide a unique signal by which to study scientific impact. Efforts in understanding and refining this signal are reflected in the probabilistic modeling of citation networks and the proliferation of citation-based impact measures such as Hirsch's h-index. While these efforts focus on understanding the past and present, they leave open the question of whether scientific impact can be predicted into the future… 

Impact-Based Ranking of Scientific Publications: A Survey and Experimental Evaluation

This work provides explicit definitions for short-term and long-term impact, introduces the associated ranking problems, and proposes a specific benchmark framework that can help differentiate effectiveness across impact aspects.

Citation Count Prediction of Academic Papers

A deep learning model is built that predicts whether a paper will receive at least one citation in the one-year interval after its publication; it employs Long Short-Term Memory (LSTM) to capture the relationship between word sequences.

How can I improve my scientific impact? The most influential factors in predicting the h-index

This study used machine learning methods to predict the h-index and feature analysis techniques to advance the understanding of feature impact; it found that non-impact-based features are more robust short-term predictors for younger scholars than for senior ones.

Can Author Collaboration Reveal Impact? The Case of h-index

The experiments indicate that there is indeed some relationship between the future h-index of an author and their structural role in the co-authorship network. The proposed method is found to outperform standard machine learning techniques based on simple graph metrics along with node representations learnt from the textual content of the author's papers.

On the Usage of Rank Percentile in Evaluating and Predicting Scientific Impacts

The basic idea is to create benchmarks and then use percentile indicators to measure the performance of a scholar or publication over time; the rank percentile indicators are shown to have reasonable predictive power.

The Science of Science and a Multilayer Network Approach to Scientists' Ranking

This article provides comprehensive coverage of recent advances in the Science of Science (SoS) related to network analysis, prediction and ranking, and investigates the issue of scientist ranking from a multilayer network perspective.

BIBTEX-based dataset generation for training citation parsers

An important part of supervised learning is a good dataset of ground truth — in this case, a large amount of already parsed citations both as text strings and key-value pairs, which are generated using different bibliography styles.

CiteTracked: A Longitudinal Dataset of Peer Reviews and Citations

This work presents CiteTracked, a dataset of peer reviews and citation statistics covering scientific papers from the machine learning community and spanning six years. It aims to foster novel interdisciplinary work between fields such as scientometrics, information retrieval, computational linguistics and natural language processing.

On the Predictability of Utilizing Rank Percentile to Evaluate Scientific Impact

This paper proposes and justifies a novel rank percentile indicator for scholars, demonstrates its advantage over traditional rank percentiles based on existing bibliographic metrics, and shows that the rank percentile is highly predictable.



Quantifying Long-Term Scientific Impact

A mechanistic model is derived for the citation dynamics of individual papers, allowing us to collapse the citation histories of papers from different journals and disciplines into a single curve, indicating that all papers tend to follow the same universal temporal pattern.

Will This Paper Increase Your h-index?

A model is developed to predict authors' future h-indices based on their current scientific impact, along with an online tool that allows users to generate informed h-index predictions.

Towards a stratified learning approach to predict future citation counts

This paper proposes a two-stage prediction model - in the first stage, the model maps a query paper into one of the six categories, and then in the second stage a regression module is run only on the subpopulation corresponding to that category to predict the future citation count of the query paper.

Predicting citation counts of papers

  • Junpeng Chen, Chunxia Zhang
  • Computer Science
  • 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC)
  • 2015
This paper proposes two types of predictive features to represent fundamental characteristics of papers and authors: six content features and ten author features, and introduces the IBM Model 1 to calculate the association probabilities between paper topics which are employed to extract content features.

Measuring academic influence: Not all citations are equal

This work presents the hip‐index and a model for predicting academic influence that achieves good performance on this data set using only four features; among the features evaluated, the best were those based on the number of times a reference is mentioned in the body of a citing paper.

The first-mover advantage in scientific publication

It is suggested that some later-published papers, albeit only a small fraction, that buck the trend and attract significantly more citations than theory predicts are probably worthy of the authors' attention.

Estimating Number of Citations Using Author Reputation

This work shows how to estimate the number of citations for an academic paper using information about past articles written by the same author(s), drawing on author information and on monitoring the items of interest for a short period of time after their creation.

To better stand on the shoulder of giants

This work learns to identify potentially influential literature via Future Influence Prediction (FIP), which aims to estimate the future influence of literature. It applies the learned model to bibliography recommendation and obtains a prominent improvement in Mean Average Precision (MAP).

On the Predictability of Future Impact in Science

By applying a future impact model to 762 careers drawn from three disciplines: physics, biology, and mathematics, this work identifies a number of subtle, but critical, flaws in current models.

Predicting scientific success based on coauthorship networks

It is shown that a machine learning classifier, based only on coauthorship network centrality metrics measured at the time of publication, is able to predict with high precision whether an article will be highly cited five years after publication.