• Computer Science, Mathematics
  • Published in ArXiv 2017

One Model To Learn Them All

@article{Kaiser2017OneMT,
  title={One Model To Learn Them All},
  author={Lukasz Kaiser and Aidan N. Gomez and Noam Shazeer and Ashish Vaswani and Niki Parmar and Llion Jones and Jakob Uszkoreit},
  journal={ArXiv},
  year={2017},
  volume={abs/1706.05137}
}
Deep learning yields great results across many fields, from speech recognition, image classification, to translation. But for each problem, getting a deep model to work well involves research into the architecture and a long period of tuning. We present a single model that yields good results on a number of problems spanning multiple domains. In particular, this single model is trained concurrently on ImageNet, multiple translation tasks, image captioning (COCO dataset), a speech recognition…
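The abstract's central idea is that one shared model is trained concurrently on batches drawn from several tasks. A minimal sketch of that scheduling idea follows; the task names and the uniform-sampling policy here are illustrative assumptions, not the paper's exact training recipe:

```python
import random

# Placeholder task names standing in for the paper's tasks (ImageNet
# classification, WMT translation, COCO captioning, speech recognition).
TASKS = ["image_classification", "translation", "captioning", "speech_recognition"]

def multitask_schedule(num_steps, seed=0):
    """Draw one task per training step, uniformly at random, so a single
    shared model sees interleaved batches from every domain."""
    rng = random.Random(seed)
    return [rng.choice(TASKS) for _ in range(num_steps)]

# A long enough schedule interleaves every task.
schedule = multitask_schedule(1000)
```

In the actual system each drawn task would supply one batch to the shared model's training step; the point of the sketch is only that no per-task architecture search or separate tuning phase appears in the loop.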

Citations

Publications citing this paper.
SHOWING 6 OF 123 CITATIONS

End-to-end speech translation system with attention-based mechanisms

CITES METHODS
HIGHLY INFLUENCED

Multimodal speech synthesis architecture for unsupervised speaker adaptation

CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

Pairwise Cluster Similarity Domain Adaptation for Multimodal Deep Learning Architecture

CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

Gaze Estimation and Interaction in Real-World Environments

CITES METHODS
HIGHLY INFLUENCED

Measuring the Intrinsic Dimension of Objective Landscapes

CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

Comprehensive and Efficient Data Labeling via Adaptive Model Scheduling

CITES BACKGROUND

CITATION STATISTICS

  • 7 Highly Influenced Citations

  • Averaged 40 Citations per year from 2017 through 2019

References

Publications referenced by this paper.
SHOWING 4 OF 28 REFERENCES

Xception: Deep Learning with Depthwise Separable Convolutions

  • François Chollet
  • Computer Science
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
HIGHLY INFLUENTIAL

Attention is All you Need


Can Active Memory Replace Attention?


Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
