SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
- Vijay Badrinarayanan, Alex Kendall, R. Cipolla
- Computer Science, IEEE Transactions on Pattern Analysis and Machine…
- 2 November 2015
Quantitative assessments show that SegNet provides good performance with competitive inference time and the most memory-efficient inference compared to other architectures, including FCN and DeconvNet.
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling
- Vijay Badrinarayanan, Ankur Handa, R. Cipolla
- Computer Science, Computer Vision and Pattern Recognition
- 27 May 2015
The results show that SegNet achieves state-of-the-art performance even without use of additional cues such as depth, video frames or post-processing with CRF models.
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks
- Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, Andrew Rabinovich
- Computer Science, International Conference on Machine Learning
- 7 November 2017
A gradient normalization (GradNorm) algorithm that automatically balances training in deep multitask models by dynamically tuning gradient magnitudes is presented, showing that for various network architectures, for both regression and classification tasks, and on both synthetic and real datasets, GradNorm improves accuracy and reduces overfitting across multiple tasks.
Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding
- Alex Kendall, Vijay Badrinarayanan, R. Cipolla
- Computer Science, British Machine Vision Conference
- 9 November 2015
A practical system that predicts pixel-wise class labels with a measure of model uncertainty is presented, showing that modelling uncertainty improves segmentation performance by 2-3% across a number of state-of-the-art architectures such as SegNet, FCN and Dilation Network, with no additional parametrisation.
Atlas: End-to-End 3D Scene Reconstruction from Posed Images
- Zak Murez, Tarrence van As, James Bartolozzi, Ayan Sinha, Vijay Badrinarayanan, Andrew Rabinovich
- Computer Science, European Conference on Computer Vision
- 23 March 2020
An end-to-end method for 3D scene reconstruction that directly regresses a truncated signed distance function (TSDF) from a set of posed RGB images is presented; semantic segmentation of the 3D model is obtained without significant computation.
FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras
- Anthony Hu, Zak Murez, Alex Kendall
- Computer Science, IEEE International Conference on Computer Vision
- 21 April 2021
FIERY is a probabilistic future prediction model in bird's-eye view from monocular cameras that predicts future instance segmentation and motion of dynamic agents that can be transformed into non-parametric future trajectories.
Understanding Real World Indoor Scenes with Synthetic Data
- Ankur Handa, Viorica Patraucean, Vijay Badrinarayanan, Simon Stent, R. Cipolla
- Computer Science, Computer Vision and Pattern Recognition
- 22 November 2015
This work focuses on depth-based semantic per-pixel labelling as a scene understanding problem and shows the potential of computer graphics to generate virtually unlimited labelled data from synthetic 3D scenes.
RoomNet: End-to-End Room Layout Estimation
- Chen-Yu Lee, Vijay Badrinarayanan, Tomasz Malisiewicz, Andrew Rabinovich
- Computer Science, IEEE International Conference on Computer Vision
- 18 March 2017
This paper predicts the locations of the room layout keypoints using RoomNet, an end-to-end trainable encoder-decoder network, and presents optional extensions to the RoomNet architecture, such as recurrent computations and memory units, to refine the keypoint locations under the same parametric capacity.
SceneNet: Understanding Real World Indoor Scenes With Synthetic Data
- Ankur Handa, Viorica Patraucean, Vijay Badrinarayanan, Simon Stent, R. Cipolla
- Computer Science, ArXiv
- 22 November 2015
This work focuses on depth-based semantic per-pixel labelling as a scene understanding problem and shows the potential of computer graphics to generate virtually unlimited labelled data from synthetic 3D scenes by carefully synthesizing training data with appropriate noise models.
Label propagation in video sequences
- Vijay Badrinarayanan, Fabio Galasso, R. Cipolla
- Computer Science, IEEE Computer Society Conference on Computer…
- 1 June 2010
This paper proposes a probabilistic graphical model for propagating labels in video sequences, also termed the label propagation problem, and reports studies on a state-of-the-art random forest classifier based video segmentation scheme, trained both on full ground truth data and on data obtained from label propagation.