MIMIC-III, a freely accessible critical care database
MIMIC-III (‘Medical Information Mart for Intensive Care’) is a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary care
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment
TextFooler is presented, a simple but strong baseline for generating adversarial text. It outperforms previous attacks in success rate and perturbation rate, is utility-preserving, and is efficient: it generates adversarial text with computational complexity linear in the text length.
What Is a Knowledge Representation?
It is argued that keeping in mind all five of these roles that a representation plays provides a usefully broad perspective that sheds light on some longstanding disputes and can invigorate both research and practice in the field.
Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment
TextFooler, a general attack framework for generating natural adversarial texts, is presented; it outperforms state-of-the-art attacks in terms of success rate and perturbation rate.
De-identification of patient notes with recurrent neural networks
The first de-identification system based on artificial neural networks (ANNs) is introduced; unlike existing systems, it requires no handcrafted features or rules, and it outperforms the state-of-the-art systems.
Viewpoint Paper: Evaluating the State-of-the-Art in Automatic De-identification
An overview of this de-identification challenge is provided, the data and the annotation process are described, the evaluation metrics are explained, the nature of the systems that addressed the challenge is discussed, the results of received system runs are analyzed, and directions for future research are identified.
Automated de-identification of free-text medical records
An automated Perl-based de-identification software package is described that is generally usable on most free-text medical records (e.g., nursing notes, discharge summaries, and X-ray reports) and that is sufficiently generalized and customizable to handle text files of any format.
NeuroNER: an easy-to-use program for named-entity recognition based on neural networks
NeuroNER is an easy-to-use named-entity recognition tool based on ANNs: users can annotate entities using a graphical web-based user interface (BRAT), and the annotations are then used to train an ANN, which in turn predicts entities’ locations and categories in new texts.
Clinically Accurate Chest X-Ray Report Generation
A domain-aware automatic chest X-ray radiology report generation system which first predicts what topics will be discussed in the report, then conditionally generates sentences corresponding to these topics, and is fine-tuned using reinforcement learning.
A Multivariate Timeseries Modeling Approach to Severity of Illness Assessment and Forecasting in ICU with Sparse, Heterogeneous Clinical Data
This work evaluates the use of multivariate timeseries modeling with multi-task Gaussian process (GP) models on noisy, incomplete, sparse, heterogeneous and unevenly-sampled clinical data, including both physiological signals and clinical notes, to assess and forecast patient acuity.