The Mythos of Model Interpretability
This research presents a meta-modelling architecture that automates the labor-intensive, time-consuming, and therefore expensive process of training and deploying supervised machine-learning models.
A Critical Review of Recurrent Neural Networks for Sequence Learning
The goal of this survey is to provide a self-contained explication of the state of the art of recurrent neural networks, together with a historical perspective and references to primary research.
Learning to Diagnose with LSTM Recurrent Neural Networks
This is the first study to empirically evaluate the ability of LSTMs to recognize patterns in multivariate time series of clinical measurements; it considers multilabel classification of diagnoses and establishes the effectiveness of a simple LSTM network for modeling clinical data.
Born Again Neural Networks
This work studies KD from a new perspective: rather than compressing models, students are parameterized identically to their teachers, and the work shows significant advantages from transferring knowledge between DenseNets and ResNets in either direction.
Stochastic Activation Pruning for Robust Adversarial Defense
Stochastic Activation Pruning (SAP) is proposed, a mixed strategy for adversarial defense that prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate.
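The SAP idea in the summary above can be sketched as follows: sample activations to keep with probability proportional to their magnitude, then rescale the survivors so the layer output stays unbiased in expectation. This is a minimal NumPy sketch; the `keep_frac` parameter and function name are illustrative, not from the paper.

```python
import numpy as np

def stochastic_activation_pruning(h, keep_frac=0.5, rng=None):
    """Sketch of SAP: sample activations with probability proportional
    to magnitude, zero out the rest, and rescale survivors by the
    inverse of their survival probability to compensate."""
    rng = np.random.default_rng() if rng is None else rng
    h = np.asarray(h, dtype=float)
    mags = np.abs(h)
    p = mags / mags.sum()                 # sampling probs ∝ magnitude
    n_draws = max(1, int(keep_frac * h.size))
    # sample indices with replacement; anything drawn at least once survives
    idx = rng.choice(h.size, size=n_draws, p=p)
    mask = np.zeros(h.size, dtype=bool)
    mask[np.unique(idx)] = True
    # survival probability of entry i over n_draws draws is 1 - (1 - p_i)^n;
    # dividing by it keeps the expected activation value unchanged
    surv = 1.0 - (1.0 - p) ** n_draws
    scale = np.where(p > 0, 1.0 / np.where(surv > 0, surv, 1.0), 0.0)
    return np.where(mask, h * scale, 0.0)
```

Because the scale factor is always at least 1, every surviving activation is amplified relative to its original value, compensating for the pruned ones.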
Deep Active Learning for Named Entity Recognition
This work shows that, by combining deep learning with active learning, it is possible to outperform classical methods even with a significantly smaller amount of training data.
Detecting and Correcting for Label Shift with Black Box Predictors
Black Box Shift Estimation (BBSE) is proposed to estimate the test distribution of p(y), and it is proved that BBSE works even when predictors are biased, inaccurate, or uncalibrated, so long as their confusion matrices are invertible.
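The invertible-confusion-matrix condition in the summary above has a direct computational reading under label shift: if C[i, j] = p_source(f(x)=i, y=j) is the predictor's joint confusion matrix and mu[i] = p_target(f(x)=i) is the distribution of its predictions on test data, the importance weights w[j] = p_target(y=j)/p_source(y=j) solve the linear system C w = mu. A minimal sketch (function name is illustrative):

```python
import numpy as np

def bbse_weights(confusion, target_pred_dist):
    """Sketch of BBSE under label shift: solve C w = mu for the
    label importance weights w = p_t(y) / p_s(y).  Requires the
    confusion matrix C to be invertible."""
    return np.linalg.solve(confusion, target_pred_dist)
```

Note the predictor itself can be arbitrarily bad; only invertibility of C matters, which is what makes the estimator "black box".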
Learning the Difference that Makes a Difference with Counterfactually-Augmented Data
This paper introduces methods and resources for training natural language processing models that are less sensitive to spurious patterns: humans are tasked with revising each document so that it accords with a counterfactual target label while retaining internal coherence.
Learning Robust Global Representations by Penalizing Local Predictive Power
A method is proposed for training robust convolutional networks by penalizing the predictive power of the local representations learned by earlier layers, which forces networks to discard predictive signals, such as color and texture, that can be gleaned from local receptive fields and to rely instead on the global structure of the image.
Optimal Thresholding of Classifiers to Maximize F1 Measure
This paper provides new insight into maximizing the F1 measure, the harmonic mean of precision and recall, in the context of both binary and multilabel classification.
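The paper derives the optimal threshold analytically; as an assumption-free stand-in, the same quantity can be found by a brute-force sweep over candidate thresholds on held-out data, maximizing F1 = 2PR/(P+R). A minimal sketch (function name is illustrative):

```python
import numpy as np

def best_f1_threshold(scores, labels):
    """Sweep every observed score as a threshold and return the one
    maximizing F1 = 2 * precision * recall / (precision + recall)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = (labels == 1).sum()
    best_t, best_f1 = 0.0, -1.0
    for t in np.unique(scores):
        pred = scores >= t
        tp = np.sum(pred & (labels == 1))
        if tp == 0:
            continue  # F1 is zero (or undefined) with no true positives
        precision = tp / pred.sum()
        recall = tp / n_pos
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

Sweeping only the observed scores suffices because F1 can change only where a prediction flips.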