• Publications
Deep multi-scale video prediction beyond mean square error
TLDR
This work trains a convolutional network to generate future frames given an input sequence and proposes three different and complementary feature learning strategies: a multi-scale architecture, an adversarial training method, and an image gradient difference loss function.
Learning Hierarchical Features for Scene Labeling
TLDR
This method uses a multiscale convolutional network trained from raw pixels to extract dense feature vectors that encode regions of multiple sizes centered on each pixel. It alleviates the need for engineered features and produces a powerful representation that captures texture, shape, and contextual information.
Indoor Semantic Segmentation using depth information
TLDR
This work addresses multi-class segmentation of indoor scenes with RGB-D inputs by applying a multiscale convolutional network to learn features directly from the images and the depth information.
Semantic Segmentation using Adversarial Networks
TLDR
An adversarial training approach to train semantic segmentation models that can detect and correct higher-order inconsistencies between ground truth segmentation maps and the ones produced by the segmentation net.
Power Watershed: A Unifying Graph-Based Optimization Framework
TLDR
This work extends a common framework for graph-based image segmentation that includes the graph cuts, random walker, and shortest path optimization algorithms and proposes a new family of segmentation algorithms that fixes p to produce an optimal spanning forest but varies the power q beyond the usual watershed algorithm.
Predicting Deeper into the Future of Semantic Segmentation
TLDR
An autoregressive convolutional neural network that learns to iteratively generate multiple frames is developed, and results show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames.
Predicting Future Instance Segmentations by Forecasting Convolutional Features
TLDR
A predictive model is developed in the space of fixed-sized convolutional features of the Mask R-CNN instance segmentation model that significantly improves over baselines based on optical flow.
Scene parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers
TLDR
A multiscale convolutional network is trained from raw pixels to extract dense feature vectors that encode regions of multiple sizes centered on each pixel, yielding record accuracies and near-record accuracy on the Stanford Background Dataset.
GDPP: Learning Diverse Generations Using Determinantal Point Process
TLDR
The Generative DPP approach shows consistent resistance to mode collapse on a wide variety of synthetic data and natural image datasets, outperforming state-of-the-art methods in data efficiency, generation quality, and convergence time while being 5.8x faster than its closest competitor.
Power watersheds: A new image segmentation framework extending graph cuts, random walker and optimal spanning forest
TLDR
This work extends a common framework for seeded image segmentation that includes the graph cuts, random walker, and shortest path optimization algorithms and proposes a new family of segmentation algorithms that fixes p to produce an optimal spanning forest but varies the power q beyond the usual watershed algorithm.