• Publications
  • Influence
Semi-supervised sequence tagging with bidirectional language models
Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for NLP tasks. However, in most cases, the recurrent network that operates on …
  • 326
  • 36
Many Languages, One Parser
We train one multilingual model for dependency parsing and use it to parse sentences in several languages. The parsing model uses (i) multilingual word clusters and embeddings; (ii) token-level …
  • 158
  • 35
DyNet: The Dynamic Neural Network Toolkit
We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure. In the static declaration strategy that is used in toolkits like Theano, CNTK, …
  • 304
  • 26
Massively Multilingual Word Embeddings
We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space. Our estimation methods, multiCluster and multiCCA, use …
  • 195
  • 26
ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing
Despite recent advances in natural language processing, many statistical models for processing text perform extremely poorly under domain shift. Processing biomedical and clinical text is a …
  • 85
  • 14
A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications
Peer reviewing is a central component in the scientific publishing process. We present the first public dataset of scientific peer reviews available for research purposes (PeerRead v1) providing an …
  • 39
  • 11
Content-Based Citation Recommendation
We present a content-based method for recommending citations in an academic paper draft. We embed a given query document into a vector space, then use its nearest neighbors as candidates, and rerank …
  • 38
  • 10
Conditional Random Field Autoencoders for Unsupervised Structured Prediction
We introduce a framework for unsupervised learning of structured predictors with overlapping, global features. Each input's latent representation is predicted conditional on the observed data using a …
  • 64
  • 10
Construction of the Literature Graph in Semantic Scholar
We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery. The resulting literature graph …
  • 89
  • 8
Automatic scoring of online discussion posts
Online discussion forums, known as forums for short, are conversational social cyberspaces constituting rich repositories of content and an important source of collaborative knowledge. However, most …
  • 68
  • 7