Corpus ID: 17964596

Neural Machine Translation and Sequence-to-sequence Models: A Tutorial

@article{Neubig2017NeuralMT,
  title={Neural Machine Translation and Sequence-to-sequence Models: A Tutorial},
  author={Graham Neubig},
  journal={ArXiv},
  year={2017},
  volume={abs/1703.01619}
}
This tutorial introduces a new and powerful set of techniques variously called "neural machine translation" or "neural sequence-to-sequence models". These techniques have been used in a number of tasks regarding the handling of human language, and can be a powerful tool in the toolbox of anyone who wants to model sequential data of some sort. The tutorial assumes that the reader knows the basics of math and programming, but does not assume any particular experience with neural networks or…

Citations
Neural Machine Translation for English to Hindi
  • Sandeep Saini, V. Sahula
  • Computer Science
  • 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP)
  • 2018
TLDR: Explores different configurations for setting up a neural machine translation system for the Indian language Hindi, and observes that NMT requires relatively little training data, yielding satisfactory translations from only a few thousand training sentences.
Neural Machine Translation: A Review
TLDR: This work traces the origins of modern NMT architectures back to word and sentence embeddings and earlier examples of the encoder-decoder network family, and concludes with a survey of recent trends in the field.
The Roles of Language Models and Hierarchical Models in Neural Sequence-to-Sequence Prediction
TLDR: Shows how traditional symbolic statistical machine translation models can still improve neural machine translation while reducing the risk of common NMT pathologies such as hallucinations and neologisms.
Is statistical machine translation approach dead
TLDR: This article describes some of the powerful and advanced techniques proposed to improve NMT systems and compares them with the conventional SMT approach on the task of Arabic-English machine translation.
Preference learning for machine translation
TLDR: Develops algorithms that learn from very large amounts of data by exploiting pairwise preferences defined over competing translations; these not only make a machine translation system robust to arbitrary texts from varied sources but also enable it to adapt effectively to new domains.
Character Based Language Modeling and Applications in Speech Recognition
Character-based approaches have recently attracted a lot of interest for language modeling tasks, as they remove some of the drawbacks of word-based models. They do not make use of a fixed size…
New resources and ideas for semantic parser induction
TLDR: This thesis investigates using source code libraries, or collections of code and text documentation, as a parallel corpus for semantic parser development, and introduces 45 new datasets in this domain along with a new and challenging text-to-code translation task.
Permutation forests for modeling word order in machine translation
TLDR: The approach in this dissertation has two main advantages: first, it allows hierarchical reordering to be applied even to languages for which no syntactic parsers are available; second, unlike the trees used in syntactic transfer, which in some cases cannot cover the reordering patterns present in the data, the trees in this work are built directly over the reordering patterns, so they cover them by definition.
On Compositionality in Neural Machine Translation
TLDR: It is quantitatively demonstrated that inadequate temporal processing, in the form of poor encoder representations, is a bottleneck for both productivity and systematicity.
First Experiments with Neural Translation of Informal to Formal Mathematics
TLDR: Experiments training deep neural networks to automatically translate informalized LaTeX-written Mizar texts into the formal Mizar language indicate that formalization through artificial neural networks is a promising approach to automated formalization of mathematics.

References

Showing 1-10 of 130 references.
Tree-to-Sequence Attentional Neural Machine Translation
TLDR: Proposes a novel end-to-end syntactic NMT model, extending a sequence-to-sequence model with source-side phrase structure; its attention mechanism enables the decoder to generate a translated word while softly aligning it with phrases as well as words of the source sentence.
Neural Machine Translation of Rare Words with Subword Units
TLDR: Introduces a simpler and more effective approach that makes the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units, and empirically shows that subword models improve over a back-off dictionary baseline by 1.3 BLEU on the WMT 15 English-German and English-Russian translation tasks.
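The subword-unit idea is easy to sketch. Below is a minimal byte-pair-encoding (BPE) learning loop in the spirit of this paper: repeatedly merge the most frequent adjacent symbol pair. Names such as learn_bpe and the toy corpus are illustrative assumptions, not the authors' released implementation.

import re
from collections import Counter

def get_pair_counts(vocab):
    # Count adjacent symbol pairs, weighted by word frequency.
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    # Replace each whole-symbol occurrence of the pair with its concatenation.
    a, b = pair
    pattern = re.compile(r"(?<!\S)" + re.escape(a) + " " + re.escape(b) + r"(?!\S)")
    return {pattern.sub(a + b, word): freq for word, freq in vocab.items()}

def learn_bpe(word_freqs, num_merges):
    # Start from characters plus an end-of-word marker; learn merge operations.
    vocab = {" ".join(w) + " </w>": f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges

if __name__ == "__main__":
    corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
    print(learn_bpe(corpus, 10))

Rare words are then segmented at translation time by applying the learned merges in order, so an open vocabulary reduces to a fixed symbol inventory.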
A Character-level Decoder without Explicit Segmentation for Neural Machine Translation
The existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation. In this paper, we ask a fundamental…
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
TLDR: Presents GNMT, Google's Neural Machine Translation system, which attempts to address many of the weaknesses of conventional phrase-based translation systems and strikes a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models.
Whole-sentence exponential language models: a vehicle for linguistic-statistical integration
TLDR: Introduces an exponential language model that models a whole sentence or utterance as a single unit, and presents a novel procedure for feature selection that exploits discrepancies between the existing model and the training corpus.
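A whole-sentence exponential model of this family is usually written as follows (a reconstruction of the standard maximum-entropy form, not copied from the paper):

p_\lambda(s) = \frac{1}{Z(\lambda)} \, p_0(s) \exp\Big( \sum_i \lambda_i f_i(s) \Big)

where p_0 is a baseline model (for example an n-gram model), the f_i are arbitrary whole-sentence features, the \lambda_i are learned weights, and Z(\lambda) normalizes over all sentences. Because Z(\lambda) is intractable to compute exactly, training typically relies on sampling.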
Sequence Level Training with Recurrent Neural Networks
TLDR: Proposes a novel sequence-level training algorithm that directly optimizes the metric used at test time, such as BLEU or ROUGE, and outperforms several strong baselines for greedy generation.
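One common way to write the sequence-level objective that such an algorithm optimizes is the following generic REINFORCE-style form, stated here as an assumption about this family of methods rather than the paper's exact loss:

L(\theta) = - \mathbb{E}_{\hat{y} \sim p_\theta}\big[ r(\hat{y}) \big], \qquad \nabla_\theta L(\theta) \approx - \big( r(\hat{y}) - b \big) \, \nabla_\theta \log p_\theta(\hat{y})

where r is the test-time metric (for example sentence-level BLEU), \hat{y} is a sequence sampled from the model, and b is a baseline that reduces the variance of the gradient estimate.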
Neural Machine Translation by Jointly Learning to Align and Translate
TLDR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and the paper proposes to extend it by allowing the model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
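The (soft-)search is usually formalized as additive attention. With h_j the annotation of source position j and s_{i-1} the previous decoder state, the standard equations are:

e_{ij} = v_a^\top \tanh(W_a s_{i-1} + U_a h_j), \qquad \alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad c_i = \sum_{j=1}^{T_x} \alpha_{ij} h_j

The context vector c_i replaces the single fixed-length sentence encoding, so each target word conditions on its own weighted view of the source.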
Neural Machine Translation in Linear Time
TLDR: The ByteNet decoder attains state-of-the-art performance on character-level language modelling and outperforms the previous best results obtained with recurrent networks; the latent alignment structure contained in the representations reflects the expected alignment between the tokens.
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
TLDR: Shows that neural machine translation performs relatively well on short sentences without unknown words, but that its performance degrades rapidly as sentence length and the number of unknown words increase.
Incorporating Discrete Translation Lexicons into Neural Machine Translation
TLDR: Describes a method to calculate the lexicon probability of the next word in the translation candidate by using the attention vector of the NMT model to select which source-word lexical probabilities the model should focus on.
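A minimal numpy sketch of this idea: the attention weights over source positions select which rows of a per-source-word lexical translation table contribute to the next-word distribution. The array names and the final interpolation with the NMT softmax are illustrative assumptions, not the paper's exact integration method.

import numpy as np

def lexicon_probability(attention, lex_probs):
    # attention: (src_len,) attention weights for the current target step.
    # lex_probs: (src_len, vocab) rows give p_lex(target word | source word j).
    # Returns a (vocab,) distribution over the next target word.
    return attention @ lex_probs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src_len, vocab = 4, 10
    attention = rng.dirichlet(np.ones(src_len))               # sums to 1
    lex_probs = rng.dirichlet(np.ones(vocab), size=src_len)   # each row sums to 1
    p_lex = lexicon_probability(attention, lex_probs)
    p_nmt = rng.dirichlet(np.ones(vocab))                     # stand-in NMT softmax
    p = 0.5 * p_lex + 0.5 * p_nmt                             # simple interpolation
    print(p.sum())                                            # ~1.0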