What’s Hidden in a Randomly Weighted Neural Network?
- Vivek Ramanujan, Mitchell Wortsman, Aniruddha Kembhavi, Ali Farhadi, Mohammad Rastegari
- Computer Science · Computer Vision and Pattern Recognition
- 29 November 2019
It is empirically shown that, as randomly weighted neural networks with fixed weights grow wider and deeper, an "untrained subnetwork" approaches the accuracy of a network with learned weights.
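The "untrained subnetwork" is selected by learning which fixed random weights to keep rather than changing their values. The sketch below is a minimal, hypothetical illustration of that idea (a score per weight, top-k selection, straight-through gradients); it is not the paper's exact algorithm, and all names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMask(torch.autograd.Function):
    """Binary mask keeping the top fraction of scores; straight-through gradient."""
    @staticmethod
    def forward(ctx, scores, k_frac):
        flat = scores.flatten()
        k = max(1, int(k_frac * flat.numel()))
        threshold = torch.topk(flat, k).values.min()
        return (scores >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass gradients to the scores unchanged.
        return grad_output, None

class MaskedLinear(nn.Module):
    """Linear layer with frozen random weights; only the per-weight scores are trained."""
    def __init__(self, in_features, out_features, k_frac=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1,
                                   requires_grad=False)  # fixed random weights
        self.scores = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.k_frac = k_frac

    def forward(self, x):
        mask = TopKMask.apply(self.scores, self.k_frac)
        return F.linear(x, self.weight * mask)
```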
Soft Threshold Weight Reparameterization for Learnable Sparsity
- Aditya Kusupati, Vivek Ramanujan, Ali Farhadi
- Computer Science · International Conference on Machine Learning
- 8 February 2020
STR is a simple mechanism that learns effective sparsity budgets, in contrast with popular heuristics; it boosts accuracy over existing results by up to 10% in the ultra-sparse (99%) regime and can also be used to induce low-rank (structured) sparsity in RNNs.
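As a rough illustration of the mechanism described above, a soft-threshold reparameterization can be written as `sign(w) * ReLU(|w| - g(s))` with a learnable threshold parameter `s`. The sketch below assumes `g = sigmoid` and a single per-layer threshold, which may differ from the paper's exact parameterization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STRLinear(nn.Module):
    """Linear layer whose effective weights are soft-thresholded:
    w_eff = sign(w) * relu(|w| - sigmoid(s)), with s learned per layer."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.s = nn.Parameter(torch.tensor(-5.0))  # sigmoid(-5) gives a tiny initial threshold

    def sparse_weight(self):
        return torch.sign(self.weight) * F.relu(self.weight.abs() - torch.sigmoid(self.s))

    def forward(self, x):
        return F.linear(x, self.sparse_weight(), self.bias)

# Sparsity emerges because every weight with |w| below sigmoid(s) maps exactly to zero,
# and the threshold s is learned jointly with the weights.
```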
Supermasks in Superposition
- Mitchell Wortsman, Vivek Ramanujan, Ali Farhadi
- Computer Science · Neural Information Processing Systems
- 26 June 2020
The Supermasks in Superposition (SupSup) model, capable of sequentially learning thousands of tasks without catastrophic forgetting, is presented, and it is found that a single gradient step is often sufficient to identify the correct mask, even among 2500 tasks.
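A minimal sketch of the single-gradient-step task inference mentioned above, simplified to one layer: the stored masks are superposed with mixing coefficients, and the coefficient whose gradient most decreases the output entropy indicates the inferred task. The shapes and the entropy objective here are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def infer_task(x, weight, masks):
    """Guess which of K stored binary masks a batch x belongs to.

    x:      (batch, in_features) input batch
    weight: (out_features, in_features) shared fixed weights
    masks:  (K, out_features, in_features) one binary supermask per task
    """
    K = masks.shape[0]
    alpha = torch.full((K,), 1.0 / K, requires_grad=True)
    # Superpose all masks, weighted by the mixing coefficients alpha.
    mixed_mask = torch.einsum('k,koi->oi', alpha, masks)
    logits = F.linear(x, weight * mixed_mask)
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=-1).mean()
    entropy.backward()
    # The task whose coefficient would most reduce the entropy is the best guess.
    return int(torch.argmin(alpha.grad))
```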
Improving Shape Deformation in Unsupervised Image-to-Image Translation
- Aaron Gokaslan, Vivek Ramanujan, Daniel Ritchie, K. Kim, J. Tompkin
- Computer Science · European Conference on Computer Vision
- 13 August 2018
This work introduces a discriminator with dilated convolutions that uses information from across the entire image to train a more context-aware generator, coupled with a multi-scale perceptual loss that better represents error in the underlying shape of objects.
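A hedged sketch of a PatchGAN-style discriminator that uses dilated convolutions to widen its receptive field without extra downsampling; the channel counts and dilation rates below are illustrative assumptions, not the paper's configuration.

```python
import torch.nn as nn

def dilated_discriminator(in_channels=3):
    """Discriminator whose later layers use dilated convolutions so each output
    unit sees context from a much larger region of the image."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 64, 4, stride=2, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(64, 128, 4, stride=2, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(128, 256, 3, padding=2, dilation=2),   # dilation widens the receptive field
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(256, 256, 3, padding=4, dilation=4),
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(256, 1, 3, padding=1),                  # per-patch real/fake score
    )
```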
Forward Compatible Training for Large-Scale Embedding Retrieval Systems
- Vivek Ramanujan, Pavan Kumar Anasosalu Vasu, Ali Farhadi, Oncel Tuzel, H. Pouransari
- Computer Science · Computer Vision and Pattern Recognition
- 6 December 2021
This work proposes a new learning paradigm for representation learning: forward compatible training (FCT), and proposes learning side-information, an auxiliary feature for each sample which facilitates future updates of the model.
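A hedged sketch of how a forward-compatible transformation with side-information might look: an old model's embedding, together with its stored side-information, is mapped into the new model's embedding space so a gallery does not need to be re-indexed after a model upgrade. The dimensions, architecture, and the training loss noted in the comment are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class FCTTransform(nn.Module):
    """Maps (old embedding, side-information) to the new model's embedding space."""
    def __init__(self, old_dim=128, side_dim=64, new_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(old_dim + side_dim, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, new_dim),
        )

    def forward(self, old_embedding, side_info):
        return self.net(torch.cat([old_embedding, side_info], dim=-1))

# Training (sketch): minimize the distance between the transformed old embedding and
# the new model's embedding of the same image, e.g. an L2 or cosine loss.
```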
LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes
- Aditya Kusupati, Matthew Wallingford, Ali Farhadi
- Computer Science · Neural Information Processing Systems
- 2 June 2021
This work proposes a novel method for Learning Low-dimensional binary Codes (LLC) for instances as well as classes. The codes are highly efficient while still ensuring nearly optimal classification accuracy for ResNet50 on ImageNet-1K, and they capture intrinsically important features in the data by discovering an intuitive taxonomy over classes.
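A hedged sketch of one generic way to learn short binary codes end-to-end, using a sign() binarization with a straight-through estimator; this illustrates the general idea of learned low-dimensional binary codes and is not claimed to be LLC's exact method.

```python
import torch
import torch.nn as nn

class BinaryCodeHead(nn.Module):
    """Projects a backbone feature to a short binary code via sign(), with a
    straight-through estimator so the projection stays trainable."""
    def __init__(self, feat_dim=2048, code_bits=20):
        super().__init__()
        self.proj = nn.Linear(feat_dim, code_bits)

    def forward(self, features):
        z = self.proj(features)
        code = torch.sign(z)
        # Forward pass uses the binary code; backward pass sees the gradient of z.
        return z + (code - z).detach()
```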
Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent
- William Cooper Merrill, Vivek Ramanujan, Yoav Goldberg, Roy Schwartz, Noah Smith
- Computer Science · Conference on Empirical Methods in Natural Language Processing
- 19 October 2020
The tendency for transformer parameters to grow in magnitude during training is studied, along with its implications for the emergent representations within self-attention layers, suggesting that saturation is a new characterization of an inductive bias implicit in GD that is of particular interest for NLP.
Parameter Norm Growth During Training of Transformers
- William Cooper Merrill, Vivek Ramanujan, Yoav Goldberg, Roy Schwartz, Noah A. Smith
- Computer Science · ArXiv
- 19 October 2020
The tendency of transformer parameters to grow in magnitude during training is studied, finding that in certain contexts GD increases the parameter $L_2$ norm up to a threshold that itself increases with training-set accuracy; increasing training accuracy over time therefore allows the norm to keep growing.
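A small, assumed utility for tracking the quantity the paper studies, the total $L_2$ norm of the model's parameters, during training; it is a monitoring aid for reproducing the observation, not part of the paper's analysis.

```python
import torch

def parameter_l2_norm(model: torch.nn.Module) -> float:
    """Total L2 norm of all trainable parameters."""
    squared = sum(p.detach().pow(2).sum()
                  for p in model.parameters() if p.requires_grad)
    return float(torch.sqrt(squared))

# Logging this once per epoch is enough to see whether the norm keeps rising
# as training accuracy improves.
```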
Matryoshka Representations for Adaptive Deployment
- Aditya Kusupati, Gantavya Bhatt, Ali Farhadi
- Computer Science · ArXiv
- 2022
This work introduces Matryoshka Representation Learning (MRL), which minimally modifies existing representation learning pipelines, imposes no additional cost during inference and deployment, and learns coarse-to-fine representations that are at least as accurate and rich as independently trained low-dimensional representations.
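A minimal sketch of coarse-to-fine (nested) representation learning: a classification loss is applied to every prefix of the representation, so the first m dimensions alone already form a usable embedding. The nesting dimensions and the separate linear heads below are illustrative assumptions rather than the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatryoshkaHead(nn.Module):
    """One classifier per nested prefix of the representation."""
    def __init__(self, rep_dim=2048, num_classes=1000, nesting=(8, 16, 32, 64, 2048)):
        super().__init__()
        self.nesting = nesting
        self.heads = nn.ModuleList([nn.Linear(m, num_classes) for m in nesting])

    def forward(self, rep, labels):
        # Sum the classification loss over every prefix length, so short prefixes
        # are trained to be accurate on their own.
        return sum(F.cross_entropy(head(rep[:, :m]), labels)
                   for m, head in zip(self.nesting, self.heads))
```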
Forward Compatible Training for Representation Learning
- Vivek Ramanujan, Pavan Kumar Anasosalu Vasu, Ali Farhadi, Oncel Tuzel, H. Pouransari
- Computer Science · ArXiv
- 2021
This work proposes a new learning paradigm for representation learning: forward compatible training (FCT), and proposes learning side-information, an auxiliary feature for each sample which facilitates future updates of the model.
...