Publications
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
TLDR
Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
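The title's "16x16 words" refers to splitting an image into fixed-size patches that are flattened into token vectors for a Transformer. A minimal sketch of that patchification step, in pure Python with an illustrative row-major pixel layout (the real model also applies a learned linear projection and position embeddings, omitted here):

```python
def patchify(image, patch_size):
    """Split an H x W image (list of rows) into non-overlapping
    patch_size x patch_size patches, each flattened into one vector.
    These flattened patches are the "words" fed to the Transformer."""
    h, w = len(image), len(image[0])
    assert h % patch_size == 0 and w % patch_size == 0
    patches = []
    for i in range(0, h, patch_size):
        for j in range(0, w, patch_size):
            patch = [image[i + di][j + dj]
                     for di in range(patch_size)
                     for dj in range(patch_size)]
            patches.append(patch)
    return patches
```

For example, a 224x224 image with 16x16 patches yields (224/16)^2 = 196 tokens.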
Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
TLDR
This paper theoretically shows that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data, and trains more than 12000 models covering most prominent methods and evaluation metrics on seven different data sets.
Parameter-Efficient Transfer Learning for NLP
TLDR
To demonstrate the adapters' effectiveness, the recently proposed BERT Transformer model is transferred to 26 diverse text classification tasks, including the GLUE benchmark, and adapters attain near state-of-the-art performance whilst adding only a few parameters per task.
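An adapter in this sense is a small bottleneck module inserted into each Transformer layer: only its few parameters are trained per task while the backbone stays frozen. A hedged sketch of the down-project / nonlinearity / up-project pattern with a residual connection (the ReLU and plain-list matrix maths here are illustrative assumptions, not the paper's exact module):

```python
def relu(x):
    return [max(0.0, v) for v in x]

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def adapter(x, W_down, W_up):
    """x: hidden vector of dim d. W_down: m x d, W_up: d x m, with m << d,
    so the trainable parameter count per task stays small.
    The residual add lets a near-zero adapter approximate the identity."""
    h = relu(matvec(W_down, x))                # project to small bottleneck
    u = matvec(W_up, h)                        # project back up
    return [xi + ui for xi, ui in zip(x, u)]   # residual connection
```

With weights initialised near zero the module starts as (approximately) the identity, which is why it can be dropped into a pretrained network without disrupting it.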
Are GANs Created Equal? A Large-Scale Study
TLDR
A neutral, multi-faceted large-scale empirical study on state-of-the-art models and evaluation measures finds that most models can reach similar scores with enough hyperparameter optimization and random restarts, suggesting that improvements can arise from a higher computational budget and tuning more than from fundamental algorithmic changes.
Combining online and offline knowledge in UCT
TLDR
This work considers three approaches for combining offline and online value functions in the UCT algorithm, and combines these algorithms in MoGo, the world's strongest 9 x 9 Go program, where each technique significantly improves MoGo's playing strength.
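The online value function in UCT is the standard UCB1 selection rule: pick the child maximizing its empirical value plus an exploration bonus that shrinks with visits. A minimal sketch (the exploration constant `c` is an illustrative assumption; tuned values vary by program):

```python
import math

def ucb1(q, n_child, n_parent, c=1.41):
    """UCT selection score for one child node.
    q: empirical mean value of the child (exploitation term).
    n_child / n_parent: visit counts; the bonus grows with parent visits
    and shrinks as this child is visited more (exploration term)."""
    return q + c * math.sqrt(math.log(n_parent) / n_child)
```

The paper's contribution is blending this online estimate with offline-learned value functions; the UCB1 term above is the baseline being improved.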
Big Transfer (BiT): General Visual Representation Learning
TLDR
By combining a few carefully selected components, and transferring using a simple heuristic, Big Transfer achieves strong performance on over 20 datasets and performs well across a surprisingly wide range of data regimes -- from 1 example per class to 1M total examples.
Towards Accurate Generative Models of Video: A New Metric & Challenges
TLDR
A large-scale human study is contributed, which confirms that FVD correlates well with qualitative human judgment of generated videos, and provides initial benchmark results on SCV.
Assessing Generative Models via Precision and Recall
TLDR
A novel definition of precision and recall for distributions is proposed, which disentangles the divergence into two separate dimensions; it is intuitive, retains desirable properties, and naturally leads to an efficient algorithm for evaluating generative models.
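For discrete distributions on a shared support, one point on such a precision-recall curve can be computed with a min-based form of this kind (a hedged sketch; the exact parameterization and the roles assigned to `p` and `q` here are assumptions, not necessarily the paper's notation):

```python
def prd_point(p, q, lam):
    """p: reference distribution, q: model distribution (same finite support,
    both summing to 1). lam > 0 trades off the two dimensions:
    precision ~ how much of q's mass lies under p,
    recall ~ how much of p's mass q covers."""
    precision = sum(min(lam * pi, qi) for pi, qi in zip(p, q))
    recall = sum(min(pi, qi / lam) for pi, qi in zip(p, q))
    return precision, recall
```

Sweeping `lam` over a range of slopes traces out the full curve; identical distributions achieve precision = recall = 1.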
Modification of UCT with Patterns in Monte-Carlo Go
TLDR
A Monte-Carlo Go program, MoGo, the first computer Go program to use UCT, is developed; the modifications of UCT for the Go application are explained, along with the pattern-based intelligent random simulations that significantly improved MoGo's performance.
Monte-Carlo tree search and rapid action value estimation in computer Go
TLDR
The Monte-Carlo revolution in computer Go is surveyed, the key ideas that led to the success of MoGo and subsequent Go programs are outlined, and for the first time a comprehensive description, in theory and in practice, of this extended framework for Monte-Carlo tree search is provided.
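Rapid action value estimation (RAVE) blends the slow but unbiased Monte-Carlo value of a move with a fast all-moves-as-first estimate, using a weight that decays as real visits accumulate. A sketch using one commonly cited schedule (the bias parameter `b` and this exact schedule are assumptions for illustration):

```python
def mc_rave_value(q_mc, n, q_rave, n_rave, b=0.5):
    """Blend the Monte-Carlo value q_mc (from n real visits) with the
    rapid all-moves-as-first estimate q_rave (from n_rave samples).
    beta -> 1 when n is small (trust the fast estimate),
    beta -> 0 as n grows (trust the unbiased Monte-Carlo value)."""
    beta = n_rave / (n + n_rave + 4.0 * b * b * n * n_rave)
    return (1.0 - beta) * q_mc + beta * q_rave
```

This lets a search tree produce useful value estimates for moves long before they have been tried often enough for plain UCT statistics to converge.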