Corpus ID: 663293

One Model To Learn Them All

@article{Kaiser2017OneMT,
  title={One Model To Learn Them All},
  author={Lukasz Kaiser and Aidan N. Gomez and Noam M. Shazeer and Ashish Vaswani and Niki Parmar and Llion Jones and Jakob Uszkoreit},
  journal={ArXiv},
  year={2017},
  volume={abs/1706.05137}
}
Deep learning yields great results across many fields, from speech recognition, image classification, to translation. But for each problem, getting a deep model to work well involves research into the architecture and a long period of tuning. We present a single model that yields good results on a number of problems spanning multiple domains. In particular, this single model is trained concurrently on ImageNet, multiple translation tasks, image captioning (COCO dataset), a speech recognition…
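
The abstract describes one model trained concurrently on tasks from several domains. As a rough illustration of that idea (a shared trunk wrapped in task-specific input/output heads, trained by cycling over tasks), the sketch below is a minimal Python/PyTorch example. It is not the paper's MultiModel architecture, which combines convolutional, attention, and sparse mixture-of-experts blocks with per-modality sub-networks; all task names, dimensions, and data here are hypothetical stand-ins.

```python
# Minimal sketch (not the paper's MultiModel): a shared trunk with
# task-specific heads, trained concurrently by cycling over tasks.
# All sizes, task names, and the toy data are illustrative assumptions.
import torch
import torch.nn as nn

class SharedTrunk(nn.Module):
    """Task-agnostic body shared by every task."""
    def __init__(self, dim=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                  nn.Linear(dim, dim), nn.ReLU())

    def forward(self, x):
        return self.body(x)

class MultiTaskModel(nn.Module):
    """Per-task input/output heads around one shared trunk."""
    def __init__(self, task_specs, dim=128):
        super().__init__()
        self.trunk = SharedTrunk(dim)
        self.in_heads = nn.ModuleDict(
            {t: nn.Linear(d_in, dim) for t, (d_in, _) in task_specs.items()})
        self.out_heads = nn.ModuleDict(
            {t: nn.Linear(dim, d_out) for t, (_, d_out) in task_specs.items()})

    def forward(self, task, x):
        return self.out_heads[task](self.trunk(self.in_heads[task](x)))

# Hypothetical tasks: (input_dim, num_classes) per task.
tasks = {"image_cls": (256, 10), "speech_cls": (64, 5)}
model = MultiTaskModel(tasks)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    for task, (d_in, n_cls) in tasks.items():   # cycle through tasks each step
        x = torch.randn(32, d_in)                # toy batch (stand-in for real data)
        y = torch.randint(0, n_cls, (32,))
        loss = loss_fn(model(task, x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Cycling tasks batch by batch is only one simple scheduling choice; the paper reports that tasks with less training data benefit most from this kind of joint training, with little degradation on the larger tasks.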
207 Citations
From Feature To Paradigm: Deep Learning In Machine Translation
Multi-Task Learning with Deep Neural Networks: A Survey
Meta-Learning for Data and Processing Efficiency
A Comparison of Loss Weighting Strategies for Multi Task Learning in Deep Neural Networks
Improving label efficiency through multitask learning on auditory data
  • 2018
One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks
  • Atish Agarwala, Abhimanyu Das, +4 authors Qiuyi Zhang
  • Computer Science, Mathematics
  • ArXiv
  • 2021

References

SHOWING 1-10 OF 31 REFERENCES
Sequence to Sequence Learning with Neural Networks
Attention is All you Need
Multi-task learning in deep neural networks for improved phoneme recognition
  • M. Seltzer, J. Droppo
  • Computer Science
  • 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
  • 2013
A unified architecture for natural language processing: deep neural networks with multitask learning
Can Active Memory Replace Attention?
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Multimodal Deep Learning
ImageNet classification with deep convolutional neural networks
Neural Machine Translation in Linear Time