
We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word level. Our model employs a convolutional neural network (CNN) over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on…
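The core of this approach is encoding each word from its characters before any word-level modeling happens. Below is a minimal NumPy sketch of that step, a character CNN with max-over-time pooling; it is not the paper's implementation, and all sizes (character vocabulary, embedding dimension, filter width and count) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, not taken from the paper:
n_chars, d_char, width, n_filters = 50, 15, 3, 8

char_emb = rng.normal(size=(n_chars, d_char))          # character embedding table
filters = rng.normal(size=(n_filters, width, d_char))  # convolution filters

def word_vector(char_ids):
    """Encode one word (a list of character ids) as a fixed-size vector
    via a character CNN with max-over-time pooling."""
    x = char_emb[char_ids]                     # (word_len, d_char)
    n_windows = len(char_ids) - width + 1
    # Convolution: dot each filter with every window of `width` characters.
    conv = np.array([[np.sum(f * x[t:t + width]) for t in range(n_windows)]
                     for f in filters])        # (n_filters, n_windows)
    return conv.max(axis=1)                    # max-over-time pooling -> (n_filters,)

v = word_vector([3, 17, 5, 9, 22])
print(v.shape)  # (8,)
```

In the full model, the sequence of these per-word vectors (rather than word embeddings) is what the LSTM language model consumes.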

Language modelling is a fundamental building block of natural language processing. However, in practice the size of the vocabulary limits the distributions applicable for this task: specifically, one has to either resort to local optimization methods, such as those used in neural language models, or work with heavily constrained distributions. In this…
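To see why vocabulary size is the bottleneck: a full softmax over the vocabulary must normalize across every word, so the cost of a single prediction grows linearly with |V|. A minimal illustration (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(logits):
    # Normalising requires a sum over the ENTIRE vocabulary, so one
    # prediction costs O(|V|) regardless of which word is being scored.
    z = logits - logits.max()   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

for vocab_size in (1_000, 100_000):
    p = softmax(rng.normal(size=vocab_size))
    assert abs(p.sum() - 1.0) < 1e-9  # a valid distribution, at O(|V|) cost
```

This per-prediction cost is what pushes practitioners toward approximate local optimization or toward restricted families of distributions.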

We give a polynomial-time algorithm for provably learning the structure and parameters of bipartite noisy-or Bayesian networks of binary variables where the top layer is completely hidden. Unsupervised learning of these models is a form of discrete factor analysis, enabling the discovery of hidden variables and their causal relationships with observed data. …
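For concreteness, here is the generative model this abstract refers to, sketched in NumPy: hidden binary causes in the top layer, observed binary variables below, and a noisy-or link where an observation fires unless every active parent cause (and a background "leak") independently fails. The sizes and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sizes: hidden top-layer causes and observed variables.
n_hidden, n_obs = 3, 6
prior = np.full(n_hidden, 0.5)                        # P(h_i = 1)
fail = rng.uniform(0.1, 0.9, size=(n_hidden, n_obs))  # failure prob of edge i -> j
leak = 0.05                                           # background "leak" cause

def p_on(h):
    """P(x_j = 1 | h) under noisy-or: x_j stays off only if the leak and
    every ACTIVE parent cause all fail independently."""
    p_off = (1 - leak) * np.prod(np.where(h[:, None] == 1, fail, 1.0), axis=0)
    return 1 - p_off

def sample():
    """Draw one (hidden, observed) pair from the bipartite network."""
    h = (rng.random(n_hidden) < prior).astype(int)
    x = (rng.random(n_obs) < p_on(h)).astype(int)
    return h, x

h, x = sample()
# With no active causes, only the leak can explain an observation:
print(p_on(np.zeros(n_hidden, dtype=int)))  # each entry equals leak = 0.05
```

The learning problem in the paper is the inverse: recovering `fail`, `prior`, and the edge structure from samples of `x` alone, with `h` never observed.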

This paper addresses the problem of multi-class classification with an extremely large number of classes, where the class predictor is learned jointly with the data representation, as is the case in language modeling problems. The predictor admits a hierarchical structure, which allows for efficient handling of settings that deal with a very large number…
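A standard instance of such a hierarchical predictor is a two-level factorization: first predict a cluster, then a class within that cluster, so one prediction costs O(n_clusters + cluster_size) instead of O(n_classes). The sketch below is a generic illustration of this idea, not the specific structure proposed in the paper; all sizes and the uniform clustering are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical sizes: 1,000 classes split evenly into 20 clusters of 50.
n_classes, n_clusters, d = 1_000, 20, 32
cluster_size = n_classes // n_clusters

W_cluster = rng.normal(size=(d, n_clusters)) * 0.01          # cluster logits
W_within = rng.normal(size=(n_clusters, d, cluster_size)) * 0.01  # per-cluster class logits

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def class_prob(hidden, c):
    """P(class c | hidden) = P(cluster of c) * P(c within its cluster).
    Only two small softmaxes are evaluated, not one over all classes."""
    g, k = divmod(c, cluster_size)
    p_g = softmax(hidden @ W_cluster)[g]      # softmax over 20 clusters
    p_k = softmax(hidden @ W_within[g])[k]    # softmax over 50 in-cluster classes
    return p_g * p_k

hidden = rng.normal(size=d)
total = sum(class_prob(hidden, c) for c in range(n_classes))
print(round(total, 6))  # 1.0 -- still a proper distribution over all classes
```

The factorization leaves the distribution properly normalized while making each prediction, and each gradient step, touch only a small slice of the output parameters.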
