Yacine Jernite

We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word level. Our model employs a convolutional neural network (CNN) over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on …
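The core of the character-level input described above is a CNN over a word's characters followed by max-over-time pooling, which yields a fixed-size word vector that the LSTM language model can consume. A minimal numpy sketch of that step, with all dimensions and the tanh nonlinearity chosen for illustration rather than taken from the paper:

```python
import numpy as np

# Hypothetical, tiny dimensions for illustration only.
CHAR_DIM = 4    # character embedding size (assumed)
N_FILTERS = 6   # number of convolution filters (assumed)
WIDTH = 3       # filter width, in characters (assumed)

rng = np.random.default_rng(0)
char_emb = rng.normal(size=(128, CHAR_DIM))           # one row per ASCII code
filters = rng.normal(size=(N_FILTERS, WIDTH, CHAR_DIM))

def word_embedding(word: str) -> np.ndarray:
    """Embed each character, slide every filter over the character
    sequence, then max-pool over time to get one fixed-size vector."""
    chars = char_emb[[ord(c) for c in word]]          # (len(word), CHAR_DIM)
    n_windows = len(word) - WIDTH + 1
    feats = np.empty((N_FILTERS, n_windows))
    for t in range(n_windows):
        window = chars[t:t + WIDTH]                   # (WIDTH, CHAR_DIM)
        feats[:, t] = np.tanh((filters * window).sum(axis=(1, 2)))
    return feats.max(axis=1)                          # max over time

vec = word_embedding("treebank")
print(vec.shape)  # (6,) — one value per filter, regardless of word length
```

Because the pooling is over time, words of any length (at least `WIDTH` characters) map to the same output dimensionality, which is what lets word-level prediction sit on top of character-level input.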
Language modelling is a fundamental building block of natural language processing. However, in practice the size of the vocabulary limits the distributions applicable for this task: specifically, one has to either resort to local optimization methods, such as those used in neural language models, or work with heavily constrained distributions. In this …
We give a polynomial-time algorithm for provably learning the structure and parameters of bipartite noisy-or Bayesian networks of binary variables where the top layer is completely hidden. Unsupervised learning of these models is a form of discrete factor analysis, enabling the discovery of hidden variables and their causal relationships with observed data. …
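In a bipartite noisy-or network of the kind described above, each active hidden cause independently fails to turn on an observed symptom with some failure probability, and a leak term covers activation with no active cause. A small sketch of that conditional distribution, with all parameter values illustrative rather than learned:

```python
import numpy as np

# F[i, j] = P(active cause i fails to turn on symptom j)  (illustrative)
F = np.array([[0.2, 0.9],    # cause 0 mostly triggers symptom 0
              [0.8, 0.3]])   # cause 1 mostly triggers symptom 1
# leak[j] = P(symptom j stays off when no cause is active)  (illustrative)
leak = np.array([0.95, 0.95])

def p_on(h):
    """Noisy-or: P(x_j = 1 | hidden binary vector h)
    = 1 - leak_j * prod over active causes i of F[i, j]."""
    h = np.asarray(h)
    fail = leak * np.prod(np.where(h[:, None] == 1, F, 1.0), axis=0)
    return 1.0 - fail

print(p_on([0, 0]))  # [0.05 0.05] — only the leak can fire a symptom
print(p_on([1, 0]))  # [0.81 0.145] — cause 0 mainly drives symptom 0
```

The independence of the failure events across causes is what makes the conditional factorize into a product, and it is this structure the learning algorithm exploits.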
This paper addresses the problem of multi-class classification with an extremely large number of classes, where the class predictor is learned jointly with the data representation, as is the case in language modeling problems. The predictor admits a hierarchical structure, which allows for efficient handling of settings that deal with a very large number …
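The efficiency gain from a hierarchical predictor can be seen in the simplest two-level case: assign each class to a group, and factor P(class | x) = P(group | x) · P(class | group, x), so scoring costs on the order of the number of groups plus the group size rather than the full number of classes. A toy sketch of that factorization (sizes and the flat two-level tree are assumptions, not the paper's exact construction):

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, N_GROUPS, PER_GROUP = 5, 4, 3   # tiny, illustrative sizes

W_group = rng.normal(size=(N_GROUPS, DIM))            # group-level scorer
W_class = rng.normal(size=(N_GROUPS, PER_GROUP, DIM))  # within-group scorer

def softmax(z):
    z = z - z.max()                  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

def class_prob(x, g, c):
    """P(class c in group g | representation x) under the hierarchy."""
    p_group = softmax(W_group @ x)[g]
    p_within = softmax(W_class[g] @ x)[c]
    return p_group * p_within

x = rng.normal(size=DIM)
total = sum(class_prob(x, g, c)
            for g in range(N_GROUPS) for c in range(PER_GROUP))
print(round(total, 6))  # 1.0 — the factorization is a valid distribution
```

Each level is a small softmax, so both evaluation and the gradient computation touch only one group's parameters per example, which is what makes joint training with the representation tractable at very large class counts.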