ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing
TLDR
We introduce scispaCy, a new Python library and models for practical biomedical/scientific text processing, which heavily leverages the spaCy library.
  • 144
  • 26
Pretrained Language Models for Sequential Sentence Classification
TLDR
We show that pretrained language models, BERT (Devlin et al., 2018) in particular, can be used for sequential sentence classification to capture contextual dependencies without the need for hierarchical encoding.
  • 26
  • 5
High-Precision Extraction of Emerging Concepts from Scientific Literature
TLDR
We present an unsupervised concept extraction method for scientific literature that achieves a Precision@1000 of 99%, compared to 86% for prior work.
  • 1
S2AND: A Benchmark and Evaluation System for Author Name Disambiguation
TLDR
We present S2AND, a new dataset and benchmark for author disambiguation that unifies eight previous AND datasets into a uniform format, with a single rich feature set drawn from the Semantic Scholar database.