fairseq: A Fast, Extensible Toolkit for Sequence Modeling

@inproceedings{Ott2019fairseqAF,
  title={fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author={Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle={NAACL},
  year={2019}
}
fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. We also support fast mixed-precision training and inference on modern GPUs. A demo video can be found at this https URL.
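As a rough illustration of the workflow the abstract describes, the sketch below loads a trained fairseq translation model through the toolkit's Python hub interface and runs half-precision inference on a GPU. The checkpoint directory, data path, and BPE files are placeholders, and the arguments assume fairseq's documented from_pretrained API.

# Minimal sketch, assuming a model trained with fairseq-train and binarized data
# under data-bin/; all paths below are placeholders.
import torch
from fairseq.models.transformer import TransformerModel

en2de = TransformerModel.from_pretrained(
    'checkpoints/',                            # directory containing the checkpoint
    checkpoint_file='checkpoint_best.pt',
    data_name_or_path='data-bin/wmt17_en_de',  # binarized data with the dictionaries
    bpe='subword_nmt',
    bpe_codes='data-bin/wmt17_en_de/bpecodes',
)

# Mixed-precision inference on a modern GPU, as mentioned in the abstract.
if torch.cuda.is_available():
    en2de = en2de.cuda().half()

print(en2de.translate('Hello world!'))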

Citations

Seq2SeqPy: A Lightweight and Customizable Toolkit for Neural Sequence-to-Sequence Modeling
TLDR
Seq2SeqPy, a lightweight toolkit for sequence-to-sequence modeling that prioritizes simplicity and the ability to customize standard architectures easily, is presented, and the toolkit is shown to perform similarly to or better than a very widely used sequence-to-sequence toolkit.
FastSeq: Make Sequence Generation Faster
TLDR
The proposed optimization techniques include an attention cache optimization, an efficient algorithm for detecting repeated n-grams, and an asynchronous generation pipeline with parallel I/O, all of which are general enough to be applicable to Transformer-based models.
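The repeated n-gram detection mentioned above is closely related to the standard no-repeat n-gram blocking used during beam search; the plain-Python sketch below illustrates the underlying idea (banning any token that would complete an n-gram already present in the hypothesis) rather than FastSeq's optimized implementation.

# Illustrative, unoptimized n-gram blocking: given the tokens generated so far,
# return the tokens that would complete an n-gram already present in the sequence.
def banned_next_tokens(generated, n=3):
    if len(generated) < n - 1:
        return set()
    prefix = tuple(generated[-(n - 1):])           # the (n-1)-gram about to be extended
    banned = set()
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])       # this token would repeat an n-gram
    return banned

tokens = ["the", "quick", "fox", "jumps", "over", "the", "quick"]
print(banned_next_tokens(tokens, n=3))  # {'fox'}: "the quick fox" already occurred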
NaturalCC: A Toolkit to Naturalize the Source Code Corpus
TLDR
NaturalCC is an efficient and extensible toolkit, built upon Fairseq and PyTorch, that bridges the gap between natural language and programming language and facilitates research on big code analysis.
The OpenNMT Neural Machine Translation Toolkit: 2020 Edition
TLDR
OpenNMT is a multi-year open-source ecosystem for neural machine translation (NMT) and natural language generation (NLG) that supports research into model architectures, feature representations, and source modalities, while maintaining API stability and competitive performance for production usages.
Scaling Up Models and Data with t5x and seqio
TLDR
Two software libraries are presented: t5x simplifies the process of building and training large language models at scale while maintaining ease of use, and seqio provides a task-based API for simple creation of fast and reproducible training data and evaluation pipelines.
YANMTT: Yet Another Neural Machine Translation Toolkit
TLDR
YANMTT aims to address these issues with a minimal amount of code, enabling users to pre-train large-scale NMT models, selectively transfer and fine-tune pre-trained parameters, perform translation, and extract representations and attention for visualization and analysis.
Efficient Inference for Multilingual Neural Machine Translation
TLDR
This work considers several ways to make multilingual NMT faster at inference without degrading its quality, and demonstrates that combining a shallow decoder with vocabulary filtering leads to almost 2 times faster inference with no loss in translation quality.
Fairseq S2T: Fast Speech-to-Text Modeling with Fairseq
TLDR
State-of-the-art RNN-based and Transformer-based models, together with open-source detailed training recipes, are implemented and seamlessly integrated into S2T workflows for multi-task learning or transfer learning.
On the Sparsity of Neural Machine Translation Models
TLDR
It is shown that the pruned parameters can be rejuvenated to improve the baseline model by up to +0.8 BLEU points and the rejuvenated parameters are reallocated to enhance the ability of modeling low-level lexical information.
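The prune-and-rejuvenate idea can be sketched with PyTorch's generic magnitude-pruning utilities: weights with the smallest L1 magnitude are zeroed out, and the pruning reparameterization can later be removed so the zeroed entries become ordinary trainable parameters again. This is a generic sketch of magnitude pruning, not the paper's exact rejuvenation procedure.

import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(512, 512)

# Zero out the 30% of weights with the smallest L1 magnitude (adds a weight mask).
prune.l1_unstructured(layer, name="weight", amount=0.3)
sparsity = (layer.weight == 0).float().mean().item()
print(f"sparsity after pruning: {sparsity:.2f}")   # ~0.30

# Fold the mask back into the weight tensor; the zeroed entries are now ordinary
# parameters that further training can update ("rejuvenate").
prune.remove(layer, "weight")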
SimpleNER Sentence Simplification System for GEM 2021
TLDR
SimpleNER is a monolingual Seq2Seq Transformer architecture that uses control tokens prepended to the data, allowing the model to shape the generated simplifications according to user-desired attributes.
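The control-token mechanism can be viewed as simple preprocessing: attribute values are discretized and prepended to each source sentence so that a standard Seq2Seq model learns to condition on them. The token names and bucketing below are illustrative placeholders, not the paper's exact scheme.

# Hypothetical control-token preprocessing: prepend discretized attribute values
# (e.g. compression ratio, lexical complexity) to each source sentence.
def add_control_tokens(source, length_ratio, lexical_complexity):
    ratio_token = f"<RATIO_{round(length_ratio, 1)}>"    # bucket continuous values
    lex_token = f"<LEX_{round(lexical_complexity, 1)}>"  # into coarse discrete tokens
    return f"{ratio_token} {lex_token} {source}"

print(add_control_tokens("The committee convened to deliberate.", 0.8, 0.6))
# <RATIO_0.8> <LEX_0.6> The committee convened to deliberate.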

References

Showing 1-10 of 56 references
OpenSeq2Seq: extensible toolkit for distributed and mixed precision training of sequence-to-sequence models
TLDR
The main goal of the toolkit is to allow researchers to most effectively explore different sequence-to-sequence architectures and it provides building blocks for training encoder-decoder models for neural machine translation and automatic speech recognition.
OpenNMT: Open-Source Toolkit for Neural Machine Translation
TLDR
The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements.
Sockeye: A Toolkit for Neural Machine Translation
TLDR
This paper highlights Sockeye's features and benchmark it against other NMT toolkits on two language arcs from the 2017 Conference on Machine Translation (WMT): English-German and Latvian-English, and reports competitive BLEU scores across all three architectures.
An Analysis of Neural Language Modeling at Multiple Scales
TLDR
This work takes existing state-of-the-art word-level language models based on LSTMs and QRNNs and extends them to both larger vocabularies and character-level granularity, achieving state-of-the-art results on character-level and word-level datasets.
ParlAI: A Dialog Research Software Platform
TLDR
ParlAI (pronounced “par-lay”), an open-source software platform for dialog research implemented in Python, is introduced, to provide a unified framework for sharing, training and testing dialog models; integration of Amazon Mechanical Turk for data collection, human evaluation, and online/reinforcement learning.
Language Modeling with Gated Convolutional Networks
TLDR
A finite-context approach through stacked convolutions, which can be more efficient since they allow parallelization over sequential tokens, is developed; this is the first time a non-recurrent approach is competitive with strong recurrent models on these large-scale language tasks.
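The stacked gated convolutions can be approximated in a few lines of PyTorch: a 1-D convolution produces twice the channels, and a gated linear unit (GLU) multiplies one half by the sigmoid of the other. This is a simplified sketch without the paper's weight normalization or residual connections.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConvLayer(nn.Module):
    """Simplified gated convolutional layer: Conv1d followed by a GLU."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        # Produce 2*channels so the GLU can split the output into value and gate halves.
        self.conv = nn.Conv1d(channels, 2 * channels, kernel_size,
                              padding=kernel_size - 1)

    def forward(self, x):                      # x: (batch, channels, time)
        y = self.conv(x)[:, :, :x.size(2)]     # trim so each position sees only the past
        return F.glu(y, dim=1)                 # value * sigmoid(gate)

x = torch.randn(2, 64, 10)
print(GatedConvLayer(64)(x).shape)             # torch.Size([2, 64, 10])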
Scaling Neural Machine Translation
TLDR
This paper shows that reduced precision and large batch training can speed up training by nearly 5x on a single 8-GPU machine with careful tuning and implementation.
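The reduced-precision, large-batch recipe can be sketched with standard PyTorch automatic mixed precision plus gradient accumulation to simulate large batches on a single GPU; the model, data, and hyperparameters below are placeholders rather than the paper's Transformer setup, and a CUDA device is assumed.

import torch

# Placeholder model and optimizer; in the paper this would be a Transformer NMT model.
model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
scaler = torch.cuda.amp.GradScaler()           # loss scaling keeps FP16 gradients stable
accumulate_steps = 16                          # simulate a 16x larger batch

for step, batch in enumerate(torch.randn(64, 32, 512).cuda()):
    with torch.cuda.amp.autocast():            # run the forward pass in mixed precision
        loss = model(batch).pow(2).mean() / accumulate_steps
    scaler.scale(loss).backward()              # accumulate scaled gradients
    if (step + 1) % accumulate_steps == 0:
        scaler.step(optimizer)                 # unscale gradients and apply the update
        scaler.update()
        optimizer.zero_grad()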
Exploring the Limits of Language Modeling
TLDR
This work explores recent advances in Recurrent Neural Networks for large-scale language modeling and extends current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and the complex, long-term structure of language.
Adaptive Input Representations for Neural Language Modeling
TLDR
Adaptive input representations for neural language modeling, which extend the adaptive softmax of Grave et al. (2017) to input representations of variable capacity, are introduced, and a systematic comparison of popular choices for a self-attentional architecture is performed.
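PyTorch ships an adaptive softmax output layer (torch.nn.AdaptiveLogSoftmaxWithLoss) implementing the Grave et al. (2017) idea that the paper extends to the input side; the toy example below shows only the output-side mechanism, with made-up vocabulary size and cluster cutoffs.

import torch
import torch.nn as nn

vocab_size, hidden = 10000, 512
# Frequent words go in the head; rarer words fall into smaller, cheaper tail clusters.
adaptive_softmax = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden,
    n_classes=vocab_size,
    cutoffs=[1000, 5000],      # head: ids 0-999; tail clusters: 1000-4999 and 5000-9999
    div_value=4.0,             # each successive tail cluster uses a 4x smaller projection
)

hidden_states = torch.randn(32, hidden)                 # e.g. decoder outputs for 32 tokens
targets = torch.randint(0, vocab_size, (32,))
print(adaptive_softmax(hidden_states, targets).loss)    # mean negative log-likelihood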
Marian: Fast Neural Machine Translation in C++
TLDR
Marian is an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs that can achieve high training and translation speed.