CORD-19: The COVID-19 Open Research Dataset
TLDR
The mechanics of dataset construction are described, highlighting challenges and key design decisions, along with an overview of how CORD-19 has been used and of several shared tasks built around the dataset.
Construction of the Literature Graph in Semantic Scholar
TLDR
This paper reduces literature graph construction to familiar NLP tasks, points out research challenges due to differences from standard formulations of these tasks, and reports empirical results for each task.
A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications
TLDR
The first public dataset of scientific peer reviews available for research purposes (PeerRead v1) is presented and it is shown that simple models can predict whether a paper is accepted with up to 21% error reduction compared to the majority baseline.
Overview of the TREC 2019 Fair Ranking Track
TLDR
An overview of the TREC Fair Ranking track is presented, including the task definition, descriptions of the data and the annotation process, as well as a comparison of the performance of submitted systems.
Mitigating Biases in CORD-19 for Analyzing COVID-19 Literature
TLDR
The results suggest that while CORD-19 exhibits a strong tilt toward recent and topically focused articles, the knowledge being explored to attack the pandemic encompasses a much longer time span and is very interdisciplinary.