Yacine Jernite

Learn More
We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convo-lutional neural network (CNN) over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on(More)
As hospitals increasingly use electronic medical records for research and quality improvement, it is important to provide ways to structure medical data without losing either expressiveness or time. We present a system that helps achieve this goal by building an extended ontology of chief complaints and automatically predicting a patient's chief complaint,(More)
Language modelling is a fundamental building block of natural language processing. However, in practice the size of the vocabulary limits the distributions applicable for this task: specifically , one has to either resort to local optimization methods, such as those used in neural language models, or work with heavily constrained distributions. In this(More)
We give a polynomial-time algorithm for provably learning the structure and parameters of bipartite noisy-or Bayesian networks of binary variables where the top layer is completely hidden. Unsupervised learning of these models is a form of discrete factor analysis, enabling the discovery of hidden variables and their causal relationships with observed data.(More)
Recurrent neural networks (RNNs) have been used extensively and with increasing success to model various types of sequential data. Much of this progress has been achieved through devising recurrent units and architectures with the flexibility to capture complex statistics in the data, such as long range dependency or localized attention phenomena. However,(More)
In the last two decades many random graph models have been proposed to extract knowledge from networks. Most of them look for communities or, more generally, clusters of vertices with homogeneous connection profiles. While the first models focused on networks with binary edges only, extensions now allow to deal with valued networks. Recently, new models(More)
OBJECTIVE To demonstrate the incremental benefit of using free text data in addition to vital sign and demographic data to identify patients with suspected infection in the emergency department. METHODS This was a retrospective, observational cohort study performed at a tertiary academic teaching hospital. All consecutive ED patient visits between(More)
This paper addresses the problem of multi-class classification with an extremely large number of classes, where the class predic-tor is learned jointly with the data representation , as is the case in language modeling problems. The predictor admits a hierarchical structure, which allows for efficient handling of settings that deal with a very large number(More)
  • 1