Structural Scaffolds for Citation Intent Classification in Scientific Publications

@inproceedings{Cohan2019StructuralSF,
  title={Structural Scaffolds for Citation Intent Classification in Scientific Publications},
  author={Arman Cohan and Waleed Ammar and Madeleine van Zuylen and Field Cady},
  booktitle={NAACL},
  year={2019}
}
Identifying the intent of a citation in scientific papers (e.g., background information, use of methods, comparing results) is critical for machine reading of individual publications and automated analysis of the scientific literature. [...] Our model achieves a new state-of-the-art on an existing ACL anthology dataset (ACL-ARC) with a 13.3% absolute increase in F1 score, without relying on external linguistic resources or hand-engineered features as done in existing methods.

GraphCite: Citation Intent Classification in Scientific Publications via Graph Embeddings

TLDR
This work presents GraphCite, a model that considers the citation graph in addition to the citation phrase, leveraging high-level information about citation patterns, and that improves significantly upon models that take only the citation phrase into consideration.

Additional Context Helps! Leveraging Cited Paper Information to Improve Citation Classification

TLDR
This work proposes a neural multi-task learning framework that harnesses the structural information of the research papers and the relation between the citation context and the cited paper for citation classification and achieves a new state of the art on the ACL-ARC dataset.

Structured Semantic Modeling of Scientific Citation Intents

TLDR
This work proposes CiTelling, a radically new model of the fine-grained semantic structures behind citational sentences that is able to represent their intent and features, and achieves high inter-annotator agreement and state-of-the-art classification results with straightforward neural network models.

A meta-analysis of semantic classification of citations

TLDR
This literature review investigates the approaches for characterizing citations based on their semantic type and explores the existing classification schemes, data sets, preprocessing methods, extraction of contextual and noncontextual features, and the different types of classifiers and evaluation approaches.

Citation Intent Classification Using Word Embedding

TLDR
This study critically investigated the available datasets for citation intent and proposed an automated citation intent technique to label the citation context with citation intent, which will enhance the study of citation context analysis.

IITP-CUNI@3C: Supervised Approaches for Citation Classification (Task A) and Citation Significance Detection (Task B)

TLDR
This paper presents team IITP-CUNI@3C's submission to the 3C shared tasks and proposes a neural multi-task learning framework that harnesses the structural information of the research papers and the relation between the citation context and the cited paper for citation classification.

ImpactCite: An XLNet-based method for Citation Impact Analysis

TLDR
Evaluation results reveal that ImpactCite achieves a new state-of-the-art performance for both citation intent and sentiment classification by outperforming the existing approaches by 3.44% and 1.33% in F1-score.

scite: a smart citation index that displays the context of citations and classifies their intent using deep learning

TLDR
A “smart citation index” called scite is developed, which categorizes citations based on context and shows how a citation was used by displaying the surrounding textual context from the citing paper, and a classification from the deep learning model that indicates whether the statement provides supporting or disputing evidence for a referenced work.

Articles for Discovering Citation Intent Classification

TLDR
Overall, SVM performed best on both datasets, followed by the stochastic gradient descent classifier; therefore, SVM can produce good results for text classification on top of contextual word embeddings.

An Authoritative Approach to Citation Classification

TLDR
It is argued that authors themselves are in a primary position to answer the question of why something was cited; a new methodology for annotating citations and a significant new dataset of 11,233 citations annotated by 883 authors are introduced.

References

Purpose and Polarity of Citation: Towards NLP-based Bibliometrics

TLDR
This paper analyzes the text that accompanies citations in scientific articles (which the authors term citation context) and proposes supervised methods for identifying citation text and analyzing it to determine the purpose and the polarity of citation.

Automatically classifying the role of citations in biomedical articles.

TLDR
The development of an eight-category classification scheme, annotation using that scheme, and development and evaluation of supervised machine-learning classifiers using the annotated data are reported on.

Measuring the Evolution of a Scientific Field through Citation Frames

TLDR
This work performs the largest behavioral study of citations to date, analyzing how scientific works frame their contributions through different types of citations and how this framing affects the field as a whole; changes in citation framing are used to show that the field of NLP is undergoing a significant increase in consensus.

Automatic classification of citation function

TLDR
This work shows that the annotation scheme for citation function is reliable, and presents a supervised machine learning framework to automatically classify citation function, using both shallow and linguistically-inspired features, finding a strong relationship between citation function and sentiment classification.

Scientific Article Summarization Using Citation-Context and Article’s Discourse Structure

TLDR
It is shown that the proposed summarization approach for scientific articles which takes advantage of citation-context and the document discourse model effectively improves over existing summarization approaches (greater than 30% improvement over the best performing baseline) in terms of ROUGE scores on TAC2014 scientific summarization dataset.

Contextualizing Citations for Scientific Summarization using Word Embeddings and Domain Knowledge

TLDR
An unsupervised model that uses distributed representations of words as well as domain knowledge to extract the appropriate context from the reference paper is proposed, and it is demonstrated how an effective contextualization method improves citation-based summarization of scientific articles.

Ensemble-style Self-training on Citation Classification

TLDR
This work builds an ensemble-style self-training classification model and achieves better classification performance using only a small amount of training data, which largely reduces the manual annotation work in this task.

Matching Citation Text and Cited Spans in Biomedical Literature: a Search-Oriented Approach

TLDR
A system is proposed that identifies text spans in the reference article that are related to a given citance, a task equivalent to citance-reference span matching, and a comparison of different citance reformulation methods and their combinations is detailed.

Citation context analysis for information retrieval

TLDR
The main hypothesis that citation terms enhance a full-text representation of scientific papers is proven and the construction of a new, realistic test collection of scientific research papers is documented, with references and associated citations automatically annotated.