Publications
CORD-19: The COVID-19 Open Research Dataset
TLDR
The mechanics of dataset construction are described, highlighting challenges and key design decisions; an overview of how CORD-19 has been used is given, along with several shared tasks built around the dataset.
Sequential Neural Networks as Automata
TLDR
This work first defines what it means for a real-time network with bounded precision to accept a language, then defines a measure of network memory, which helps explain neural computation as well as the relationship between neural networks and natural language grammar.
Context-Free Transductions with Neural Stacks
TLDR
It is shown that stack-augmented RNNs can discover intuitive stack-based strategies for solving tasks, while more complex networks often find approximate solutions by using the stack as unstructured memory.
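For concreteness, here is a minimal numpy sketch of the kind of differentiable stack this line of work studies, following the soft push/pop/read semantics of Grefenstette et al. (2015); the function and variable names are mine, and this is an illustrative reimplementation, not the paper's code.

```python
import numpy as np

def stack_step(V, s, u, d, v):
    """One soft update of a differentiable stack.

    V: (t, m) array of stored vectors; s: (t,) array of strengths in [0, 1];
    u: pop strength; d: push strength; v: (m,) vector to push.
    Returns the updated (V, s) and the soft read vector r.
    """
    t = len(s)
    new_s = np.empty(t + 1)
    remaining = u
    # Pop: consume strength from the top of the stack downward.
    for i in range(t - 1, -1, -1):
        new_s[i] = max(0.0, s[i] - remaining)
        remaining = max(0.0, remaining - s[i])
    new_s[t] = d                       # push v with strength d
    V = np.vstack([V, v[None, :]])
    # Read: weighted sum of the topmost vectors whose strengths sum to 1.
    r = np.zeros(V.shape[1])
    budget = 1.0
    for i in range(t, -1, -1):
        w = min(new_s[i], budget)
        r += w * V[i]
        budget -= w
        if budget <= 0.0:
            break
    return V, new_s, r

# Push "x" then "y", then pop once: the read returns to "x".
V, s = np.zeros((0, 2)), np.zeros(0)
V, s, r = stack_step(V, s, u=0.0, d=1.0, v=np.array([1.0, 0.0]))  # push x
V, s, r = stack_step(V, s, u=0.0, d=1.0, v=np.array([0.0, 1.0]))  # push y
V, s, r = stack_step(V, s, u=1.0, d=0.0, v=np.zeros(2))           # pop y
print(r)  # -> [1. 0.]
```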
End-to-End Graph-Based TAG Parsing with Neural Networks
TLDR
This work presents a graph-based Tree Adjoining Grammar (TAG) parser that uses BiLSTMs, highway connections, and character-level CNNs, and demonstrates that the proposed parser achieves state-of-the-art performance in the downstream tasks of Parsing Evaluation using Textual Entailments (PETE) and Unbounded Dependency Recovery.
Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?
TLDR
This work formalizes ways in which ungrounded language models appear to be fundamentally limited in their ability to “understand”, and suggests that assertions in code or language do not provide sufficient signal to fully emulate semantic representations.
A Formal Hierarchy of RNN Architectures
TLDR
A formal hierarchy of RNN architectures is developed; it is hypothesized that the practical learnable capacity of unsaturated RNNs obeys a similar hierarchy, and empirical results are provided to support this conjecture.
Competency Problems: On Finding and Removing Artifacts in Language Data
TLDR
This work argues that for complex language understanding tasks, all simple feature correlations are spurious, formalizes this notion into a class of problems called competency problems, and gives a simple statistical test for dataset artifacts that is used to reveal more subtle biases.
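As an illustration of what such an artifact test can look like, the sketch below applies a two-sided binomial test to a single feature's co-occurrence with a label, assuming balanced binary labels; the helper name and example counts are hypothetical, and this is a simplification rather than the exact test from the paper.

```python
from scipy.stats import binomtest

def artifact_pvalue(n_with_positive_label: int, n_occurrences: int) -> float:
    """Two-sided binomial test of whether a feature co-occurs with the
    positive label at a rate different from the 0.5 expected when the
    feature carries no information (balanced binary labels assumed)."""
    return binomtest(n_with_positive_label, n_occurrences, p=0.5).pvalue

# Hypothetical example: the token "not" appears in 500 examples,
# 350 of which are labeled positive.
print(artifact_pvalue(350, 500))  # tiny p-value -> flag "not" as an artifact
```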
On the Linguistic Capacity of Real-Time Counter Automata
TLDR
This work studies the abilities of real-time counter machines as formal grammars, focusing on formal properties relevant to NLP models, and makes general contributions to the theory of formal languages that are of potential interest for understanding recurrent neural networks.
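A worked example of the machines in question: the classic non-regular language $a^n b^n$ is recognizable by a real-time one-counter automaton, sketched below in Python (an illustration of the machine model, not the paper's construction).

```python
def accepts_anbn(s: str) -> bool:
    """A real-time one-counter machine for {a^n b^n : n >= 0}: one
    left-to-right pass, finite state (have we seen a 'b'?) plus a single
    integer counter incremented on 'a' and decremented on 'b'."""
    counter, seen_b = 0, False
    for ch in s:
        if ch == "a":
            if seen_b:
                return False        # 'a' after 'b' breaks the a*b* shape
            counter += 1
        elif ch == "b":
            seen_b = True
            counter -= 1
            if counter < 0:
                return False        # more b's than a's so far
        else:
            return False
    return counter == 0             # accept iff the counter returns to zero

assert accepts_anbn("aaabbb") and not accepts_anbn("aabbb")
```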
Parameter Norm Growth During Training of Transformers
TLDR
The tendency of transformer parameters to grow in magnitude during training is studied, finding that in certain contexts gradient descent increases the parameter $L_2$ norm up to a threshold that itself increases with training-set accuracy; increasing training accuracy over time thus enables the norm to keep growing.
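As a sanity check of the phenomenon, one can track the parameter $L_2$ norm alongside training accuracy; the PyTorch sketch below does this on a toy classifier (the model, data, and hyperparameters are mine, not the paper's transformer setup).

```python
import torch

def param_l2_norm(model: torch.nn.Module) -> float:
    """L2 norm of the concatenation of all model parameters."""
    return torch.sqrt(sum(p.detach().pow(2).sum()
                          for p in model.parameters())).item()

# Toy setup: log the norm as gradient descent drives accuracy up.
torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(256, 10), torch.randint(0, 2, (256,))
for step in range(1001):
    opt.zero_grad()
    logits = model(x)
    loss = torch.nn.functional.cross_entropy(logits, y)
    loss.backward()
    opt.step()
    if step % 200 == 0:
        acc = (logits.argmax(-1) == y).float().mean().item()
        print(f"step {step:4d}  acc {acc:.2f}  "
              f"||theta|| {param_l2_norm(model):.2f}")
```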
Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent
TLDR
The tendency of transformer parameters to grow in magnitude during training and its implications for the emergent representations within self-attention layers are studied, suggesting that saturation is a new characterization of an inductive bias implicit in gradient descent that is of particular interest for NLP.