• Publications
Construction of the Literature Graph in Semantic Scholar
TLDR
We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery.
  • 128
  • 17
A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications
TLDR
We present the first public dataset of scientific peer reviews available for research purposes (PeerRead), providing an opportunity to study this important artifact.
  • 60
  • 16
Structural Scaffolds for Citation Intent Classification in Scientific Publications
TLDR
We propose structural scaffolds, a multitask model to incorporate structural information of scientific papers into citations for effective classification of citation intents.
  • 37
  • 6
Fact or Fiction: Verifying Scientific Claims
TLDR
We introduce scientific claim verification, a new task to select abstracts from the research literature containing evidence that supports or refutes a given scientific claim, and to identify rationales justifying each decision.
  • 24
  • 5
Apoptosis-related genes control autophagy and influence DENV-2 infection in the mosquito vector, Aedes aegypti.
The mosquito Aedes aegypti is the primary urban vector for dengue virus (DENV) worldwide. Insight into interactions occurring between host and pathogen is important in understanding what factors…
  • 32
  • 2
SciREX: A Challenge Dataset for Document-Level Information Extraction
TLDR
We introduce SciREX, a document-level IE dataset that encompasses multiple IE tasks, including salient entity identification and document-level $N$-ary relation identification from scientific articles.
  • 13
  • 2
Quantifying Sex Bias in Clinical Studies at Scale With Automated Data Extraction
Key Points. Question: What is the magnitude of female underrepresentation in clinical studies? Findings: In this cross-sectional study, machine reading to extract sex data from 43 135 published articles…
  • 18
MedICaT: A Dataset of Medical Images, Captions, and Textual References
TLDR
We introduce MedICaT, a dataset of medical figures, captions, subfigures/subcaptions, and inline references that enables the study of these figures in context.
Extracting a Knowledge Base of Mechanisms from COVID-19 Papers
TLDR
The urgency of mitigating COVID-19 has spawned a large and diverse body of scientific literature that is challenging for researchers to navigate.
  • 2