A Primer on Neural Network Models for Natural Language Processing

@article{Goldberg2016APO,
  title={A Primer on Neural Network Models for Natural Language Processing},
  author={Yoav Goldberg},
  journal={ArXiv},
  year={2016},
  volume={abs/1510.00726}
}
Over the past few years, neural networks have re-emerged as powerful machine-learning models, yielding state-of-the-art results in fields such as image recognition and speech processing. More recently, neural network models have also been applied to textual natural language signals, again with very promising results. This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with…

Citations

An analysis of convolutional neural networks for sentence classification
TLDR
A series of experiments with convolutional neural networks for sentence-level classification tasks is presented, showing how sensitive model performance is to changes in hyperparameter settings.
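The summary above describes sweeping knobs such as filter widths, number of feature maps, and dropout. As a rough, hypothetical illustration of the kind of model such experiments tune (not code from the paper), here is a minimal one-layer CNN sentence classifier in PyTorch; every name and hyperparameter value below is an assumption.

```python
# Illustrative sketch only: a one-layer CNN sentence classifier of the kind
# such sensitivity studies sweep; all hyperparameter values are assumptions.
import torch
import torch.nn as nn

class CNNSentenceClassifier(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128,
                 filter_widths=(3, 4, 5), n_filters=100,
                 n_classes=2, dropout=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One 1D convolution per filter width, each yielding n_filters maps.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, n_filters, w) for w in filter_widths
        )
        self.dropout = nn.Dropout(dropout)
        self.out = nn.Linear(n_filters * len(filter_widths), n_classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # Convolve, apply ReLU, then max-pool over time per filter width.
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.out(self.dropout(torch.cat(pooled, dim=1)))

model = CNNSentenceClassifier()
logits = model(torch.randint(0, 10000, (4, 20)))  # 4 sentences, 20 tokens each
print(logits.shape)  # torch.Size([4, 2])
```

The hyperparameters a study like this varies are exactly the constructor arguments: the filter widths, the number of feature maps, the dropout rate, and the pooling strategy.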
Neural Network Methods for Natural Language Processing
TLDR
This book focuses on the application of neural network models to natural language data, and introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models.
Natural language generation as neural sequence learning and beyond
TLDR
A deep reinforcement learning framework is proposed to inject prior knowledge into neural NLG models, with an application to sentence simplification, together with a novel hierarchical recurrent neural network architecture for representing and generating multiple sentences.
Empirical Exploration of Novel Architectures and Objectives for Language Models
TLDR
An empirical comparison of LSTM and CNN language models on English broadcast news and various conversational telephone speech transcription tasks is presented, and a novel criterion for training language models that combines word and class prediction in a multi-task learning framework is proposed.
Chapter 1 Analyzing Neural Network Optimization with Gradient Tracing
Neural networks have led to state-of-the-art results in fields such as natural language processing [3] and computer vision. However, their lack of interpretability remains a persistent and serious…
Natural Language Understanding with Distributed Representation
TLDR
This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing, spending much of its length on the basics of machine learning and neural networks before introducing how they are used for natural language.
Neural information retrieval: at the end of the early years
TLDR
The successes of neural IR thus far are highlighted, obstacles to its wider adoption are cataloged, and potentially promising directions for future research are suggested.
Survey of Neural Text Representation Models
TLDR
This survey systematizes and analyzes 50 neural models from the last decade, focusing on task-independent representation models, discusses their advantages and drawbacks, and identifies promising directions for future neural text representation models.
The Roles of Language Models and Hierarchical Models in Neural Sequence-to-Sequence Prediction
TLDR
It is shown how traditional symbolic statistical machine translation models can still improve neural machine translation while reducing the risk of common pathologies of NMT such as hallucinations and neologisms.
Rational Recurrences
TLDR
It is shown that several recent neural models use rational recurrences; one such model is presented that performs better than two recent baselines on language modeling and text classification, demonstrating that transferring intuitions from classical models like WFSAs can be an effective approach to designing and understanding neural models.

References

Showing 1-10 of 200 references
LSTM Neural Networks for Language Modeling
TLDR
This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task, and obtains considerable improvements in word error rate (WER) on top of a state-of-the-art speech recognition system.
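As a loose sketch of the architecture this entry evaluates (illustrative only, not the paper's implementation), the following is a word-level LSTM language model in PyTorch; the vocabulary and layer sizes are arbitrary assumptions.

```python
# Hypothetical sketch of a word-level LSTM language model; all sizes are
# illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)  # next-word logits

    def forward(self, token_ids):                # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))  # (batch, seq_len, hidden_dim)
        return self.proj(h)                      # logits at every position

model = LSTMLanguageModel()
tokens = torch.randint(0, 10000, (2, 12))
# Standard LM training objective: predict tokens[:, 1:] from tokens[:, :-1].
loss = nn.CrossEntropyLoss()(
    model(tokens[:, :-1]).reshape(-1, 10000), tokens[:, 1:].reshape(-1)
)
print(float(loss))
```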
Statistical Language Models Based on Neural Networks
TLDR
Although these models are computationally more expensive than n-gram models, the presented techniques make it possible to apply them to state-of-the-art systems efficiently, achieving the best published performance on the well-known Penn Treebank setup.
Natural Language Understanding with Distributed Representation
TLDR
This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing, spending much of its length on the basics of machine learning and neural networks before introducing how they are used for natural language.
Recurrent neural network based language model
TLDR
Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN language models, compared to a state-of-the-art backoff language model.
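To make the claim concrete: mixing language models typically means linearly interpolating their per-word probabilities with a weight tuned on held-out data, and perplexity is the exponentiated average negative log-probability. A toy sketch with invented numbers (nothing below comes from the paper):

```python
# Toy sketch: linear interpolation of two language models and perplexity.
# All probabilities and the mixing weight are invented for illustration.
import math

# Per-word probabilities each model assigns to a tiny test sequence (assumed).
p_rnn     = [0.20, 0.05, 0.30, 0.10]
p_backoff = [0.10, 0.04, 0.10, 0.08]
lam = 0.5  # interpolation weight, an arbitrary assumption

def perplexity(probs):
    # exp of the average negative log-probability per word
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

p_mix = [lam * a + (1 - lam) * b for a, b in zip(p_rnn, p_backoff)]
print(perplexity(p_rnn), perplexity(p_backoff), perplexity(p_mix))
```

Interpolating several RNN LMs trained with different seeds or sizes works the same way, with one weight per model, the weights summing to one.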
A unified architecture for natural language processing: deep neural networks with multitask learning
We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic…
Character-Aware Neural Language Models
TLDR
A simple neural language model that relies only on character-level inputs is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.
Learning Longer Memory in Recurrent Neural Networks
TLDR
This paper shows that learning longer-term patterns in real data, such as natural language, is perfectly possible using gradient descent, via a slight structural modification of the simple recurrent neural network architecture.
A Convolutional Neural Network for Modelling Sentences
TLDR
A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described, adopted for the semantic modelling of sentences; it induces a feature graph over the sentence that is capable of explicitly capturing short- and long-range relations.
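A distinctive ingredient of the DCNN is k-max pooling, which keeps the k largest activations of each feature map while preserving their original left-to-right order, so some positional information survives pooling. Below is a minimal sketch of plain (non-dynamic) k-max pooling in PyTorch, as one simplified reading of the operation; in the DCNN itself, k varies with sentence length and layer depth.

```python
# Minimal sketch of k-max pooling: keep the k largest values along the time
# axis, in their original order. A simplified illustration, not the DCNN code.
import torch

def k_max_pool(x, k):
    # x: (batch, channels, seq_len); indices of the top-k activations ...
    idx = x.topk(k, dim=2).indices
    # ... re-sorted so the selected values keep their original order.
    idx = idx.sort(dim=2).values
    return x.gather(2, idx)

x = torch.tensor([[[1.0, 5.0, 2.0, 4.0, 3.0]]])
print(k_max_pool(x, 3))  # tensor([[[5., 4., 3.]]])
```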
Joint Language and Translation Modeling with Recurrent Neural Networks
TLDR
This work presents a joint language and translation model, based on a recurrent neural network, that predicts target words from an unbounded history of both source and target words, and shows competitive accuracy compared to traditional channel model features.
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
TLDR
It is shown that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as sentence length and the number of unknown words increase.