A Primer on Neural Network Models for Natural Language Processing

@article{Goldberg2016APO,
  title={A Primer on Neural Network Models for Natural Language Processing},
  author={Yoav Goldberg},
  journal={ArXiv},
  year={2016},
  volume={abs/1510.00726}
}
Over the past few years, neural networks have re-emerged as powerful machine-learning models, yielding state-of-the-art results in fields such as image recognition and speech processing. More recently, neural network models started to be applied also to textual natural language signals, again with very promising results. This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.

Citations

An analysis of convolutional neural networks for sentence classification
TLDR
A series of experiments with convolutional neural networks for sentence-level classification tasks under different hyperparameter settings is reported, showing how sensitive model performance is to changes in these configurations.
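As a rough illustration of what such a sensitivity analysis varies, here is a minimal sketch of a one-layer CNN sentence classifier in PyTorch; the filter region sizes, number of feature maps, and dropout rate exposed as constructor arguments are the knobs typically swept, and every concrete value below is an illustrative assumption, not the paper's setting.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceCNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128,
                 filter_sizes=(3, 4, 5),   # filter region sizes: a key hyperparameter
                 num_filters=100,          # feature maps per region size
                 dropout=0.5, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in filter_sizes)
        self.drop = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, token_ids):                  # (batch, seq_len) word ids
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # One max-pooled feature vector per region size, then concatenate.
        pooled = [F.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.fc(self.drop(torch.cat(pooled, dim=1)))
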
Word-Based Prediction Using LSTM Neural Networks for Language Modeling
2021
Neural networks have become increasingly popular for the task of language modeling. Whereas feedforward networks only exploit a fixed context length to predict the next word of a sequence, recurrent neural networks can, in principle, take into account all of the predecessor words.
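A minimal sketch of that contrast, with all sizes assumed: the feedforward model can only condition on a fixed window of N previous words, while the LSTM's recurrent state can in principle summarize an arbitrarily long prefix.

import torch
import torch.nn as nn

V, D, H, N = 10000, 64, 128, 4   # vocab, embedding, hidden, window size (all assumed)

class FeedforwardLM(nn.Module):          # fixed context: exactly N previous words
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(V, D)
        self.mlp = nn.Sequential(nn.Linear(N * D, H), nn.Tanh(), nn.Linear(H, V))

    def forward(self, window):           # (batch, N) token ids
        return self.mlp(self.embed(window).flatten(1))   # next-word logits

class LSTMLM(nn.Module):                 # unbounded context via recurrent state
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(V, D)
        self.lstm = nn.LSTM(D, H, batch_first=True)
        self.out = nn.Linear(H, V)

    def forward(self, prefix):           # (batch, seq_len) token ids, any length
        h, _ = self.lstm(self.embed(prefix))
        return self.out(h[:, -1])        # next-word logits from the last state
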
Neural Network Methods for Natural Language Processing
TLDR
This book focuses on the application of neural network models to natural language data, and introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models.
Natural language generation as neural sequence learning and beyond
TLDR
A deep reinforcement learning framework is proposed to inject prior knowledge into neural NLG models and is applied to sentence simplification; a novel hierarchical recurrent neural network architecture is also proposed to represent and generate multiple sentences.
Empirical Exploration of Novel Architectures and Objectives for Language Models
TLDR
An empirical comparison of LSTM and CNN language models on English broadcast news and various conversational telephone speech transcription tasks is presented, and a novel criterion for training language models that combines word and class prediction in a multi-task learning framework is proposed.
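A hedged sketch of that combined criterion: the language model's hidden state feeds two output layers, one predicting the next word and one predicting its class, and the two cross-entropies are summed. The weighting, head names, and word-to-class map below are illustrative assumptions, not the paper's specification.

import torch
import torch.nn.functional as F

def multitask_lm_loss(hidden, word_head, class_head, next_word, word2class,
                      class_weight=0.5):   # weighting is assumed, not from the paper
    # hidden: (batch, H) LM state; word_head/class_head: nn.Linear output layers;
    # word2class: LongTensor mapping each word id to its class id.
    word_loss = F.cross_entropy(word_head(hidden), next_word)
    class_loss = F.cross_entropy(class_head(hidden), word2class[next_word])
    return word_loss + class_weight * class_loss
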
Chapter 1: Analyzing Neural Network Optimization with Gradient Tracing
Neural networks have led to state-of-the-art results in fields such as natural language processing [3] and computer vision. However, their lack of interpretability remains a persistent and serious challenge.
Natural Language Understanding with Distributed Representation
TLDR
This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing; it spends much time describing the basics of machine learning and neural networks, and only then introduces how they are used for natural language.
Neural information retrieval: at the end of the early years
TLDR
The successes of neural IR thus far are highlighted, obstacles to its wider adoption are cataloged, and potentially promising directions for future research are suggested.
Survey of Neural Text Representation Models
TLDR
This survey systematizes and analyzes 50 neural models from the last decade, focusing on task-independent representation models, discusses their advantages and drawbacks, and identifies promising directions for future neural text representation models.
The Roles of Language Models and Hierarchical Models in Neural Sequence-to-Sequence Prediction
TLDR
It is shown how traditional symbolic statistical machine translation models can still improve neural machine translation while reducing the risk of common pathologies of NMT such as hallucinations and neologisms.

References

Showing 1-10 of 200 references.
LSTM Neural Networks for Language Modeling
TLDR
This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task, and achieves considerable improvements in word error rate (WER) on top of a state-of-the-art speech recognition system.
Statistical Language Models Based on Neural Networks
TLDR
Although these models are computationally more expensive than n-gram models, with the presented techniques it is possible to apply them to state-of-the-art systems efficiently, and they achieve the best published performance on the well-known Penn Treebank setup.
Recurrent neural network based language model
TLDR
Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
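A toy sketch of the mechanism behind that result: the mixture assigns each next word a weighted average of the component models' probabilities, and perplexity is the exponentiated negative mean log-probability over held-out text. The models here are arbitrary callables standing in for the paper's RNN LMs.

import math

def mixture_perplexity(models, weights, text_stream):
    # models: callables p(word, history) returning a probability;
    # weights: mixture coefficients summing to 1;
    # text_stream: iterable of (history, word) pairs from held-out text.
    log_prob, n = 0.0, 0
    for history, word in text_stream:
        p = sum(w * m(word, history) for m, w in zip(models, weights))
        log_prob += math.log(p)
        n += 1
    return math.exp(-log_prob / n)
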
Natural Language Understanding with Distributed Representation
TLDR
This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing; it spends much time describing the basics of machine learning and neural networks, and only then introduces how they are used for natural language.
Character-Aware Neural Language Models
TLDR
A simple neural language model that relies only on character-level inputs is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.
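A minimal sketch (all dimensions assumed) of the character-level front end such a model uses in place of a word-embedding lookup table: convolutions over character embeddings, max-pooled over positions, produce one vector per word. The published model additionally passes this vector through a highway network, omitted here.

import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    def __init__(self, n_chars=100, char_dim=16, widths=(2, 3, 4), n_filters=32):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, char_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(char_dim, n_filters, w) for w in widths)

    def forward(self, char_ids):                 # (batch, word_len) character ids
        x = self.char_embed(char_ids).transpose(1, 2)
        # Max over positions: each filter fires on one orthographic pattern
        # (a prefix, suffix, or stem) wherever it occurs in the word.
        feats = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return torch.cat(feats, dim=1)           # one fixed-size vector per word
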
A unified architecture for natural language processing: deep neural networks with multitask learning
We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words, and the likelihood that the sentence makes sense (grammatically and semantically) using a language model.
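A schematic sketch (all sizes and the task inventory are assumed) of that multitask setup: a single shared sentence encoder feeds one small output head per task, so the tasks share features and regularize one another.

import torch
import torch.nn as nn

class SharedEncoderTagger(nn.Module):
    def __init__(self, vocab=10000, dim=50, n_pos=45, n_chunk=20, n_ner=9):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1)   # shared encoder
        self.heads = nn.ModuleDict({
            "pos": nn.Linear(dim, n_pos),        # part-of-speech tags
            "chunk": nn.Linear(dim, n_chunk),    # chunk labels
            "ner": nn.Linear(dim, n_ner),        # named entity tags
        })

    def forward(self, tokens):                   # (batch, seq_len) token ids
        h = torch.relu(self.conv(self.embed(tokens).transpose(1, 2)))
        h = h.transpose(1, 2)                    # (batch, seq_len, dim)
        return {task: head(h) for task, head in self.heads.items()}
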
Learning Longer Memory in Recurrent Neural Networks
TLDR
This paper shows that learning longer-term patterns in real data, such as natural language, is perfectly possible using gradient descent, via a slight structural modification of the simple recurrent neural network architecture.
A Convolutional Neural Network for Modelling Sentences
TLDR
A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described and adopted for the semantic modelling of sentences; it induces a feature graph over the sentence that is capable of explicitly capturing short- and long-range relations.
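The DCNN's characteristic pooling operation is (dynamic) k-max pooling: the k largest activations in each feature map are kept in their original order, so the relative positions of strong detections survive pooling; in the full model k shrinks with depth as a function of sentence length. A small numpy sketch of the core operation:

import numpy as np

def k_max_pool(x, k):
    # x: (n_filters, seq_len) feature maps -> (n_filters, k), order preserved.
    idx = np.argpartition(x, -k, axis=1)[:, -k:]  # positions of the k largest values
    idx = np.sort(idx, axis=1)                    # restore left-to-right order
    return np.take_along_axis(x, idx, axis=1)
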
Joint Language and Translation Modeling with Recurrent Neural Networks
TLDR
This work presents a joint language and translation model based on a recurrent neural network which predicts target words from an unbounded history of both source and target words, and shows competitive accuracy compared to traditional channel model features.
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
TLDR
It is shown that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as sentence length and the number of unknown words increase.