- Yoon Kim, Yacine Jernite, David Sontag, Alexander M. Rush
- AAAI
- 2016

We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on…
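The character-level pipeline the abstract describes can be sketched at the shape level: embed each character, convolve over the character sequence, and max-pool over time to obtain a fixed-size word representation that a word-level LSTM would then consume. Everything below (sizes, initialization, the five-character word) is illustrative, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

CHAR_VOCAB = 30   # number of distinct characters (assumed)
CHAR_DIM = 8      # character embedding size (assumed)
N_FILTERS = 16    # number of convolution filters (assumed)
WIDTH = 3         # filter width in characters (assumed)

char_emb = rng.normal(size=(CHAR_VOCAB, CHAR_DIM))
filters = rng.normal(size=(N_FILTERS, WIDTH, CHAR_DIM))

def word_representation(char_ids):
    """CNN over characters + max-over-time pooling -> fixed-size word vector."""
    x = char_emb[char_ids]                       # (T, CHAR_DIM)
    T = len(char_ids)
    # valid convolution over the time (character) axis
    conv = np.stack([
        np.array([np.sum(x[t:t + WIDTH] * f) for t in range(T - WIDTH + 1)])
        for f in filters
    ])                                           # (N_FILTERS, T - WIDTH + 1)
    return np.tanh(conv).max(axis=1)             # (N_FILTERS,)

# One word spelled as five (made-up) character ids; the resulting vector
# would be the input to the word-level LSTM at that position.
vec = word_representation([3, 7, 1, 12, 5])
```

Because pooling is over the time axis, words of any length map to the same `N_FILTERS`-dimensional space, which is what lets the downstream LSTM stay word-level.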

- Yacine Jernite, Edouard Grave, Armand Joulin, Tomas Mikolov
- ArXiv
- 2016

Recurrent neural networks (RNNs) have been used extensively and with increasing success to model various types of sequential data. Much of this progress has been achieved through devising recurrent units and architectures with the flexibility to capture complex statistics in the data, such as long range dependency or localized attention phenomena. However,…
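As background for the recurrent units the abstract surveys, the basic recurrence is worth stating: the hidden state is updated from the previous state and the current input at every step. This is the textbook vanilla formulation with arbitrary sizes, not this paper's model.

```python
import numpy as np

rng = np.random.default_rng(42)
H, D = 4, 3                                   # hidden and input sizes (assumed)
W_h = rng.normal(scale=0.1, size=(H, H))
W_x = rng.normal(scale=0.1, size=(H, D))
b = np.zeros(H)

def rnn(inputs):
    """Run the recurrence h_t = tanh(W_h h_{t-1} + W_x x_t + b)."""
    h = np.zeros(H)
    for x in inputs:
        h = np.tanh(W_h @ h + W_x @ x + b)
    return h

h_final = rnn(rng.normal(size=(5, D)))        # hidden state after five steps
```

The richer units the abstract alludes to (LSTMs, gated or attention-augmented cells) replace the single `tanh` update with gated combinations of the same two inputs.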

As hospitals increasingly use electronic medical records for research and quality improvement, it is important to provide ways to structure medical data without losing either expressiveness or time. We present a system that helps achieve this goal by building an extended ontology of chief complaints and automatically predicting a patient’s chief complaint,…

- Yacine Jernite, Alexander M. Rush, David Sontag
- ICML
- 2015

Language modelling is a fundamental building block of natural language processing. However, in practice the size of the vocabulary limits the distributions applicable for this task: specifically, one has to either resort to local optimization methods, such as those used in neural language models, or work with heavily constrained distributions. In this work,…
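The vocabulary bottleneck the abstract mentions comes from normalization: a word-level softmax must sum over all V words. A standard workaround (shown here as a generic illustration, not this paper's contribution) is a two-level class factorization, p(w) = p(class(w)) · p(w | class(w)), which replaces one V-way normalization with two much smaller ones. All sizes and the chosen class/word indices below are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

V = 10_000                                    # vocabulary size (assumed)
C = 100                                       # number of word classes (assumed)
h = rng.normal(size=16)                       # hidden state (assumed size)

def softmax(z):
    z = z - z.max()                           # for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Full softmax: one V-way normalization per prediction.
W_full = rng.normal(size=(V, 16))
p_full = softmax(W_full @ h)

# Class-factored softmax: a C-way choice, then a (V/C)-way choice.
W_class = rng.normal(size=(C, 16))
W_word = rng.normal(size=(C, V // C, 16))
cls = 7                                       # class of the target word (made up)
p_cls = softmax(W_class @ h)[cls]
p_in_cls = softmax(W_word[cls] @ h)

p_word = p_cls * p_in_cls[3]                  # p(class) * p(word | class)
```

Scoring one word now touches C + V/C = 200 logits instead of 10,000, at the cost of fixing a class structure in advance.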

- Yacine Jernite, Yonatan Halpern, David Sontag
- NIPS
- 2013

We give a polynomial-time algorithm for provably learning the structure and parameters of bipartite noisy-or Bayesian networks of binary variables where the top layer is completely hidden. Unsupervised learning of these models is a form of discrete factor analysis, enabling the discovery of hidden variables and their causal relationships with observed data.…
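The noisy-or parameterization behind this model class is compact enough to state directly: with hidden binary parents h, failure probabilities F[i, j] (the chance that active parent i fails to turn on observed child j), and a per-child leak, P(x_j = 1 | h) = 1 − (1 − leak_j) · Π over active parents of F[i, j]. The numbers below are made up for illustration; learning these parameters from the observed layer alone is what the paper addresses.

```python
import numpy as np

F = np.array([[0.2, 0.9],
              [0.5, 0.1]])       # F[i, j]: parent i fails to activate child j
leak = np.array([0.01, 0.01])    # background activation probability per child

def p_child_on(h):
    """P(x_j = 1 | h) under the noisy-or parameterization."""
    # inactive parents contribute a failure factor of 1 (no effect)
    fail = np.prod(np.where(h[:, None] == 1, F, 1.0), axis=0)
    return 1.0 - (1.0 - leak) * fail

p = p_child_on(np.array([1, 0]))   # only parent 0 active
```

With only parent 0 active, child 0 is likely on (its failure probability is 0.2) while child 1 is likely off (failure probability 0.9), showing how each hidden cause leaves a distinct signature on the observed layer.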

In the last two decades, many random graph models have been proposed to extract knowledge from networks. Most of them look for communities or, more generally, clusters of vertices with homogeneous connection profiles. While the first models focused on networks with binary edges only, extensions now allow one to deal with valued networks. Recently, new models…

- Yacine Jernite, Samuel R. Bowman, David Sontag
- ArXiv
- 2017

This work presents a novel objective function for the unsupervised training of neural network sentence encoders. It exploits signals from paragraph-level discourse coherence to train these models to understand text. Our objective is purely discriminative, allowing us to train models many times faster than was possible under prior methods, and it yields…
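One way to see why a discriminative discourse objective is cheap: instead of generating the next sentence, the model only has to classify relations between nearby sentences, so training data is manufactured for free from raw paragraphs. The sketch below builds binary ordered-vs-swapped examples from adjacent sentence pairs; the exact task set and labels are illustrative, not necessarily the paper's.

```python
import random

def order_examples(paragraph, seed=0):
    """Yield ((s_a, s_b), label) pairs: 1 if in document order, 0 if swapped."""
    rng = random.Random(seed)
    examples = []
    for a, b in zip(paragraph, paragraph[1:]):
        if rng.random() < 0.5:
            examples.append(((a, b), 1))   # kept in original order
        else:
            examples.append(((b, a), 0))   # swapped
    return examples

paragraph = ["The model is trained on paragraphs.",
             "Each pair of adjacent sentences becomes an example.",
             "A classifier predicts whether the pair was swapped."]
data = order_examples(paragraph)
```

A sentence encoder then embeds both sentences and a small classifier predicts the label, so the expensive word-by-word decoder of generative objectives is never needed.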

- Steven Horng, David A Sontag, Yoni Halpern, Yacine Jernite, Nathan I Shapiro, Larry A Nathanson
- PloS one
- 2017

OBJECTIVE: To demonstrate the incremental benefit of using free text data in addition to vital sign and demographic data to identify patients with suspected infection in the emergency department.

METHODS: This was a retrospective, observational cohort study performed at a tertiary academic teaching hospital. All consecutive ED patient visits between…
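The comparison the study sets up can be sketched as a linear classifier over structured features (vitals, demographics) with and without bag-of-words features from free-text triage notes appended. The vocabulary, weights, and intercept below are invented purely to illustrate the feature construction, not the study's fitted model.

```python
import numpy as np

VOCAB = ["fever", "cough", "ankle"]   # tiny illustrative note vocabulary

def featurize(vitals, note):
    """Concatenate structured vitals with binary bag-of-words note features."""
    words = note.lower().split()
    text_feats = [float(w in words) for w in VOCAB]
    return np.array(list(vitals) + text_feats)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# weights over [temperature, heart_rate, 'fever', 'cough', 'ankle'] (made up)
w = np.array([0.8, 0.01, 1.5, 0.7, -1.0])
b = -32.0

def p_infection(vitals, note):
    return sigmoid(w @ featurize(vitals, note) + b)

with_text = p_infection((38.5, 110), "fever and cough for two days")
vitals_only = p_infection((38.5, 110), "")
```

Holding the vitals fixed, the note terms move the score, which is the "incremental benefit of free text" the objective describes, here in toy form.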

- Yacine Jernite, Anna Choromanska, David Sontag, Yann LeCun
- ArXiv
- 2016

- Yacine Jernite, Anna Choromanska, David Sontag
- ICML
- 2017

We consider multi-class classification where the predictor has a hierarchical structure that allows for a very large number of labels both at train and test time. The predictive power of such models can heavily depend on the structure of the tree, and although past work showed how to learn the tree structure, it expected that the feature vectors remained…
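The efficiency argument for label trees is that each internal node routes the input one way or the other, so predicting one of many labels costs O(depth) decisions instead of scoring every label. The sketch below uses a toy per-node feature-threshold routing rule; the tree structure, thresholds, and labels are made up, and learning a good tree (the paper's concern) is not shown.

```python
class Node:
    def __init__(self, label=None, left=None, right=None,
                 threshold=None, dim=None):
        self.label = label                          # set only at leaves
        self.left, self.right = left, right
        self.threshold, self.dim = threshold, dim   # routing rule at internal nodes

def predict(node, x):
    """Walk from the root to a leaf, one feature test per level."""
    while node.label is None:
        node = node.left if x[node.dim] < node.threshold else node.right
    return node.label

# A depth-2 tree over four labels (structure and thresholds invented).
tree = Node(dim=0, threshold=0.5,
            left=Node(dim=1, threshold=0.5,
                      left=Node(label="A"), right=Node(label="B")),
            right=Node(dim=1, threshold=0.5,
                       left=Node(label="C"), right=Node(label="D")))

pred = predict(tree, [0.9, 0.1])   # routes right at the root, then left
```

With L labels arranged in a balanced binary tree, inference makes about log2(L) routing decisions, which is what makes very large label sets tractable; the quality of those decisions depends entirely on the tree, hence the paper's focus on learning it.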