Learning to Reweight Examples for Robust Deep Learning
- Mengye Ren, Wenyuan Zeng, Bin Yang, R. Urtasun
- Computer Science, International Conference on Machine Learning
- 24 March 2018
This work proposes a novel meta-learning algorithm that learns to assign weights to training examples based on their gradient directions; the method can be easily implemented on any type of deep network, requires no additional hyperparameter tuning, and achieves impressive performance on class imbalance and corrupted-label problems where only a small amount of clean validation data is available.
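A minimal PyTorch sketch of the core idea, assuming a linear binary classifier `w` (a tensor created with `requires_grad=True`) and a small clean validation batch; the function name, shapes, and learning rate are illustrative, not the authors' reference implementation:

```python
import torch
import torch.nn.functional as F

def reweight_step(w, x_tr, y_tr, x_val, y_val, lr=0.1):
    # w: (D,) parameters with requires_grad=True; x_*: (N, D); y_*: (N,) floats in {0, 1}
    eps = torch.zeros(x_tr.size(0), requires_grad=True)           # per-example weights
    per_example = F.binary_cross_entropy_with_logits(x_tr @ w, y_tr, reduction="none")
    grad_w, = torch.autograd.grad((eps * per_example).sum(), w, create_graph=True)
    w_hat = w - lr * grad_w                                       # one "virtual" SGD step
    val_loss = F.binary_cross_entropy_with_logits(x_val @ w_hat, y_val)
    grad_eps, = torch.autograd.grad(val_loss, eps)                # agreement with validation gradient
    weights = torch.clamp(-grad_eps, min=0.0)                     # keep only helpful examples
    return weights / (weights.sum() + 1e-8)                       # normalize to sum to one
```

The returned weights would then be used to form the weighted training loss for the actual parameter update.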
Meta-Learning for Semi-Supervised Few-Shot Classification
- Mengye Ren, Eleni Triantafillou, R. Zemel
- Computer Science, International Conference on Learning…
- 15 February 2018
This work proposes novel extensions of Prototypical Networks that are augmented with the ability to use unlabeled examples when producing prototypes, and confirms that these models can learn to improve their predictions due to unlabeled examples, much like a semi-supervised algorithm would.
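A minimal sketch of soft k-means prototype refinement with unlabeled embeddings, one of the extensions described above, assuming embeddings have already been computed; the function name and shapes are illustrative:

```python
import torch

def refine_prototypes(support, support_labels, unlabeled, num_classes, n_iters=1):
    # support: (N, D) labeled embeddings; support_labels: (N,) long; unlabeled: (M, D)
    counts = torch.bincount(support_labels, minlength=num_classes).float()          # (K,)
    labeled_sum = torch.zeros(num_classes, support.size(1)).index_add_(0, support_labels, support)
    protos = labeled_sum / counts[:, None]                                          # class means
    for _ in range(n_iters):
        # soft-assign each unlabeled embedding to its nearest prototype
        z = torch.softmax(-torch.cdist(unlabeled, protos) ** 2, dim=1)              # (M, K)
        protos = (labeled_sum + z.t() @ unlabeled) / (counts[:, None] + z.sum(0)[:, None])
    return protos
```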
Exploring Models and Data for Image Question Answering
- Mengye Ren, Ryan Kiros, R. Zemel
- Computer Science, NIPS
- 8 May 2015
This work proposes to use neural networks and visual semantic embeddings, without intermediate stages such as object detection and image segmentation, to predict answers to simple questions about images, and presents a question generation algorithm that converts image descriptions into QA form.
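A PyTorch sketch of a visual-semantic-embedding QA model in the spirit of the description above: the CNN image feature is projected into the word-embedding space, fed to an LSTM as the first "word" of the question, and the final state is classified over a fixed answer vocabulary. The dimensions and class name are assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class VisSemEmbedQA(nn.Module):
    def __init__(self, vocab_size, n_answers, img_dim=2048, d=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        self.img_proj = nn.Linear(img_dim, d)       # visual-semantic embedding of the image
        self.lstm = nn.LSTM(d, d, batch_first=True)
        self.cls = nn.Linear(d, n_answers)

    def forward(self, img_feat, question_ids):
        img = self.img_proj(img_feat).unsqueeze(1)           # (B, 1, d), acts as the first "word"
        words = self.embed(question_ids)                     # (B, T, d)
        _, (h, _) = self.lstm(torch.cat([img, words], dim=1))
        return self.cls(h[-1])                               # logits over the answer vocabulary
```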
The Reversible Residual Network: Backpropagation Without Storing Activations
- Aidan N. Gomez, Mengye Ren, R. Urtasun, R. Grosse
- Computer Science, NIPS
- 14 July 2017
The Reversible Residual Network (RevNet) is presented, a variant of ResNets where each layer's activations can be reconstructed exactly from the next layer's; therefore, the activations for most layers need not be stored in memory during backpropagation.
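A minimal sketch of a reversible block in this spirit, assuming F and G are arbitrary shape-preserving modules acting on the two halves of the channels; because `inverse()` reconstructs the inputs exactly from the outputs, activations can be recomputed instead of stored during the backward pass:

```python
import torch.nn as nn

class ReversibleBlock(nn.Module):
    def __init__(self, F, G):
        super().__init__()
        self.F, self.G = F, G    # arbitrary residual functions on half the channels

    def forward(self, x1, x2):
        y1 = x1 + self.F(x2)
        y2 = x2 + self.G(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # exact reconstruction of the inputs from the outputs
        x2 = y2 - self.G(y1)
        x1 = y1 - self.F(x2)
        return x1, x2
```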
Graph HyperNetworks for Neural Architecture Search
- Chris Zhang, Mengye Ren, R. Urtasun
- Computer Science, International Conference on Learning…
- 27 September 2018
The Graph HyperNetwork (GHN) is proposed to amortize the search cost: given an architecture, it directly generates the weights by running inference on a graph neural network, and it predicts network performance more accurately than regular hypernetworks and premature early stopping.
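A hedged sketch of the idea: a small message-passing network over the architecture graph whose node embeddings are decoded into weights for each node's operation. The fixed 3x3, 16-channel weight head and the GRU update are illustrative assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class TinyGraphHyperNet(nn.Module):
    def __init__(self, d=32, steps=3, out_ch=16, in_ch=16, k=3):
        super().__init__()
        self.steps = steps
        self.shape = (out_ch, in_ch, k, k)
        self.msg = nn.Linear(d, d)
        self.upd = nn.GRUCell(d, d)
        self.decode = nn.Linear(d, out_ch * in_ch * k * k)

    def forward(self, node_feats, adj):
        # node_feats: (N, d) embeddings of each node's op type; adj: (N, N) adjacency matrix
        h = node_feats
        for _ in range(self.steps):
            m = adj @ self.msg(h)          # aggregate messages along graph edges
            h = self.upd(m, h)             # per-node GRU update
        return self.decode(h).view(-1, *self.shape)   # one conv kernel per node
```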
Incremental Few-Shot Learning with Attention Attractor Networks
- Mengye Ren, Renjie Liao, Ethan Fetaya, R. Zemel
- Computer Science, Neural Information Processing Systems
- 16 October 2018
A meta-learning model, the Attention Attractor Network, is proposed to regularize the learning of novel classes, and it is demonstrated that the learned attractor network can help recognize novel classes while remembering old classes without the need to review the original training set.
Image Question Answering: A Visual Semantic Embedding Model and a New Dataset
- Mengye Ren, Ryan Kiros, R. Zemel
- Computer Science, ArXiv
- 8 May 2015
This work proposes to use recurrent neural networks and visual semantic embeddings, without intermediate stages such as object detection and image segmentation, to address the problem of image-based question-answering (QA) with new models and datasets.
End-to-End Instance Segmentation with Recurrent Attention
- Mengye Ren, R. Zemel
- Computer Science, Computer Vision and Pattern Recognition
- 30 May 2016
An end-to-end recurrent neural network (RNN) architecture with an attention mechanism is proposed to model a human-like counting process and produce detailed instance segmentations.
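A simplified sketch of a recurrent decoder in this spirit, emitting one instance mask plus a "keep counting" score per step; the encoder, the content-based attention, and all sizes are assumptions for illustration only:

```python
import torch
import torch.nn as nn

class RecurrentInstanceDecoder(nn.Module):
    def __init__(self, feat_dim=256, hid=256, mask_hw=28, max_steps=8):
        super().__init__()
        self.max_steps, self.mask_hw = max_steps, mask_hw
        self.cell = nn.LSTMCell(feat_dim, hid)
        self.attn = nn.Linear(hid, feat_dim)                 # content-based attention query
        self.mask_head = nn.Linear(hid, mask_hw * mask_hw)
        self.stop_head = nn.Linear(hid, 1)

    def forward(self, feats):
        # feats: (B, L, feat_dim) flattened image features from a CNN encoder
        B = feats.size(0)
        h = feats.new_zeros(B, self.cell.hidden_size)
        c = torch.zeros_like(h)
        masks, scores = [], []
        for _ in range(self.max_steps):
            attn = torch.softmax(feats @ self.attn(h).unsqueeze(-1), dim=1)   # (B, L, 1)
            glimpse = (attn * feats).sum(dim=1)                               # attended feature
            h, c = self.cell(glimpse, (h, c))
            masks.append(torch.sigmoid(self.mask_head(h)).view(B, self.mask_hw, self.mask_hw))
            scores.append(torch.sigmoid(self.stop_head(h)))                   # prob. another instance remains
        return torch.stack(masks, 1), torch.cat(scores, 1)
```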
Understanding Short-Horizon Bias in Stochastic Meta-Optimization
- Yuhuai Wu, Mengye Ren, Renjie Liao, R. Grosse
- Computer Science, International Conference on Learning…
- 15 February 2018
Short-horizon bias is identified as a fundamental problem that needs to be addressed if meta-optimization is to scale to practical neural net training regimes, and a toy problem, a noisy quadratic cost function, is introduced on which the bias is analyzed.
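A tiny NumPy sketch of a noisy quadratic toy problem of the kind mentioned above, where the optimum is only observed with noise at each step; the curvatures, noise scale, and learning rate are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)
h = np.array([1.0, 0.1])          # per-dimension curvatures
theta = np.ones(2)                # parameters; the true optimum is at 0
lr = 0.2

for t in range(100):
    c_noisy = rng.normal(0.0, 1.0, size=2)   # noisy observation of the optimum
    grad = h * (theta - c_noisy)             # stochastic gradient of 0.5 * h * (theta - c)^2
    theta -= lr * grad

print(0.5 * np.sum(h * theta ** 2))          # loss measured at the true optimum
```

On such a problem, tuning the learning-rate schedule to look good over a short lookahead drives the learning rate down too aggressively, which is the bias the paper analyzes.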
SBNet: Sparse Blocks Network for Fast Inference
- Mengye Ren, A. Pokrovsky, Bin Yang, R. Urtasun
- Computer Science, IEEE/CVF Conference on Computer Vision and…
- 7 January 2018
This work leverages the sparsity structure of computation masks and proposes a novel tiling-based sparse convolution algorithm that is effective on LiDAR-based 3D object detection, and reports significant wall-clock speed-ups compared to dense convolution without noticeable loss of accuracy.
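A hedged sketch of the gather / dense-convolve / scatter pattern over active blocks, assuming a 3x3 kernel, unit batch, and spatial dimensions divisible by the block size; the real SBNet batches the gathered tiles and fuses these steps in custom kernels rather than looping in Python:

```python
import torch
import torch.nn.functional as F

def sparse_block_conv(x, weight, mask, block=8):
    # x: (1, C, H, W); weight: (C_out, C, 3, 3); mask: (H // block, W // block) bool
    _, _, H, W = x.shape
    xp = F.pad(x, (1, 1, 1, 1))                      # halo for the 3x3 kernel
    out = x.new_zeros(1, weight.shape[0], H, W)
    for i, j in mask.nonzero(as_tuple=False):        # visit only active blocks
        i, j = int(i), int(j)
        patch = xp[:, :, i*block:(i+1)*block + 2, j*block:(j+1)*block + 2]
        out[:, :, i*block:(i+1)*block, j*block:(j+1)*block] = F.conv2d(patch, weight)
    return out
```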