Structural Scaffolds for Citation Intent Classification in Scientific Publications

  title={Structural Scaffolds for Citation Intent Classification in Scientific Publications},
  author={Arman Cohan and Waleed Ammar and Madeleine van Zuylen and Field Cady},
  booktitle={North American Chapter of the Association for Computational Linguistics},
Identifying the intent of a citation in scientific papers (e.g., background information, use of methods, comparing results) is critical for machine reading of individual publications and automated analysis of the scientific literature. [...] Our model achieves a new state-of-the-art on an existing ACL Anthology dataset (ACL-ARC) with a 13.3% absolute increase in F1 score, without relying on external linguistic resources or hand-engineered features as done in existing methods.
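The abstract reports an absolute increase in F1, meaning a difference in percentage points rather than a ratio to the old score. The following is a minimal sketch of that distinction. It assumes macro-averaged F1 over intent classes, which the text above does not state. The label names, the toy predictions, and the helper names `macro_f1`, `absolute_gain`, and `relative_gain` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def absolute_gain(new_f1, old_f1):
    """Gain in percentage points; both inputs are fractions in [0, 1]."""
    return 100.0 * (new_f1 - old_f1)


def relative_gain(new_f1, old_f1):
    """Gain as a percentage of the old score."""
    return 100.0 * (new_f1 - old_f1) / old_f1


# Toy data only (hypothetical labels and predictions, not results from any paper).
gold = ["background", "method", "result", "background", "method", "result"]
baseline = ["background", "background", "result", "background", "result", "result"]
improved = ["background", "method", "result", "background", "method", "background"]

old_score = macro_f1(gold, baseline)
new_score = macro_f1(gold, improved)
print(f"baseline macro-F1: {old_score:.3f}")
print(f"improved macro-F1: {new_score:.3f}")
print(f"absolute gain: {absolute_gain(new_score, old_score):.1f} points")
print(f"relative gain: {relative_gain(new_score, old_score):.1f}%")
```

On the same pair of scores the relative gain is always the larger number when the baseline is below 1.0, which is why the two should not be conflated when comparing reported improvements.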

GraphCite: Citation Intent Classification in Scientific Publications via Graph Embeddings

GraphCite is presented, a model that considers the citation graph in addition to the citation phrase, leveraging high-level information about citation patterns, and that improves significantly upon models that take only the citation phrase into consideration.

Additional Context Helps! Leveraging Cited Paper Information to Improve Citation Classification

This work proposes a neural multi-task learning framework that harnesses the structural information of the research papers and the relation between the citation context and the cited paper for citation classification and achieves a new state of the art on the ACL-ARC dataset.

Structured Semantic Modeling of Scientific Citation Intents

This work proposes CiTelling, a radically new model of the fine-grained semantic structures behind citational sentences that can represent their intent and features, and achieves high inter-annotator agreement and state-of-the-art classification results with straightforward neural network models.

A meta-analysis of semantic classification of citations

This literature review investigates the approaches for characterizing citations based on their semantic type and explores the existing classification schemes, data sets, preprocessing methods, extraction of contextual and noncontextual features, and the different types of classifiers and evaluation approaches.

Dynamic Context Extraction for Citation Classification

A new automated unsupervised approach for the selection of a dynamic-size and potentially non-contiguous citation context, which utilises the transformer-based document representations and embedding similarities, is introduced.
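The idea of picking a dynamic-size, possibly non-contiguous context by embedding similarity can be illustrated with a small sketch. This is not the paper's method: it assumes a simple cosine-similarity threshold against the citing sentence, and it uses hand-written toy vectors in place of transformer-based representations. The function names `cosine` and `select_context`, the threshold value, and the vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_context(embeddings, citing_idx, threshold=0.8):
    """Return indices of sentences similar enough to the citing sentence.

    The citing sentence is always kept. The result may skip sentences,
    so the selected context need not be contiguous.
    """
    anchor = embeddings[citing_idx]
    return [
        i
        for i, vec in enumerate(embeddings)
        if i == citing_idx or cosine(anchor, vec) >= threshold
    ]


# Toy sentence vectors (stand-ins for real document embeddings).
toy_embeddings = [
    [0.9, 0.1, 0.0],  # sentence 0: close to the citing sentence
    [0.0, 0.2, 0.9],  # sentence 1: unrelated
    [1.0, 0.0, 0.0],  # sentence 2: the citing sentence
    [0.1, 0.9, 0.1],  # sentence 3: unrelated
    [0.8, 0.3, 0.1],  # sentence 4: close to the citing sentence
]

context = select_context(toy_embeddings, citing_idx=2, threshold=0.8)
print(context)
```

With these toy vectors the selected context skips sentences 1 and 3, so both its size and its contiguity depend on the data rather than on a fixed window.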

Towards employing native information in citation function classification

This paper extracts all of the native information features of citation instances with different function labels, integrates them into different neural text representation models via trainable embeddings and free text, and proposes to use recently developed text representation models integrated with such information to evaluate performance on the citation function classification task.

Citation Intent Classification Using Word Embedding

This study critically investigated the available datasets for citation intent and proposed an automated citation intent technique to label the citation context with citation intent, which will enhance the study of citation context analysis.

Classification of URL Citations in Scholarly Papers for Promoting Utilization of Research Artifacts

The classification task for each URL citation is addressed to identify the role that the referenced resources play in research activities, the type of referenced resources, and the reason why the author cited the resources.

ImpactCite: An XLNet-based method for Citation Impact Analysis

Evaluation results reveal that ImpactCite achieves a new state-of-the-art performance for both citation intent and sentiment classification by outperforming the existing approaches by 3.44% and 1.33% in F1-score.

Citation Worthiness Identification for Fine-Grained Citation Recommendation Systems

Meysam Roostaee. Materials Science. Iranian Journal of Science and Technology, Transactions of Electrical Engineering, 2022.
Citing properly in order to support concepts, claims and arguments is one of the main requirements of writing any scientific text. However, manual analysis of the input text to identify potential



Purpose and Polarity of Citation: Towards NLP-based Bibliometrics

This paper analyzes the text that accompanies citations in scientific articles (which the authors term citation context) and proposes supervised methods for identifying citation text and analyzing it to determine the purpose and the polarity of citation.

Automatically classifying the role of citations in biomedical articles.

The development of an eight-category classification scheme, annotation using that scheme, and development and evaluation of supervised machine-learning classifiers using the annotated data are reported on.

Automatic classification of citation function

This work shows that the annotation scheme for citation function is reliable, and presents a supervised machine learning framework to automatically classify citation function, using both shallow and linguistically-inspired features, finding a strong relationship between citation function and sentiment classification.

Scientific Article Summarization Using Citation-Context and Article’s Discourse Structure

It is shown that the proposed summarization approach for scientific articles, which takes advantage of citation context and the document discourse model, effectively improves over existing summarization approaches (greater than 30% improvement over the best-performing baseline) in terms of ROUGE scores on the TAC2014 scientific summarization dataset.

Contextualizing Citations for Scientific Summarization using Word Embeddings and Domain Knowledge

An unsupervised model that uses distributed representations of words as well as domain knowledge to extract the appropriate context from the reference paper is proposed, and it is demonstrated how an effective contextualization method improves citation-based summarization of scientific articles.

Ensemble-style Self-training on Citation Classification

This work builds an ensemble-style self-training classification model and obtains better classification performance using only a small amount of training data, which largely reduces the manual annotation work in this task.

Matching Citation Text and Cited Spans in Biomedical Literature: a Search-Oriented Approach

A system is proposed that identifies text spans in the reference article that are related to a given citance, a task equivalent to citance-reference span matching, and a comparison of different citance reformulation methods and their combinations is detailed.

Citation context analysis for information retrieval

The main hypothesis that citation terms enhance a full-text representation of scientific papers is proven and the construction of a new, realistic test collection of scientific research papers is documented, with references and associated citations automatically annotated.

A New Approach for Scientific Citation Classification Using Cue Phrases

The method is based on Ripple-Down Rules, a knowledge acquisition method that proved very successful in practice for building medical expert systems and does not require a knowledge engineer to be implemented.