LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop
- F. Yu, Yinda Zhang, Shuran Song, Ari Seff, Jianxiong Xiao
- Computer Science, ArXiv
- 10 June 2015
This work proposes to amplify human effort through a partially automated labeling scheme, leveraging deep learning with humans in the loop, and constructs a new image dataset, LSUN, which contains around one million labeled images for each of 10 scene categories and 20 object categories.
Matterport3D: Learning from RGB-D Data in Indoor Environments
- Angel X. Chang, Angela Dai, Yinda Zhang
- Computer Science, International Conference on 3D Vision
- 18 September 2017
Matterport3D is introduced, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes, enabling a variety of supervised and self-supervised computer vision tasks, including keypoint matching, view overlap prediction, normal prediction from color, semantic segmentation, and region classification.
Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
- Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, W. Liu, Yu-Gang Jiang
- Computer Science, European Conference on Computer Vision
- 5 April 2018
An end-to-end deep learning architecture that produces a 3D triangular mesh from a single color image by progressively deforming an ellipsoid, leveraging perceptual features extracted from the input image.
DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene From Sparse LiDAR Data and Single Color Image
- Jiaxiong Qiu, Zhaopeng Cui, M. Pollefeys
- Computer Science, Computer Vision and Pattern Recognition
- 2 December 2018
A deep learning architecture that produces accurate dense depth for outdoor scenes from a single color image and sparse depth, improving upon the state-of-the-art performance on the KITTI depth completion benchmark.
PanoContext: A Whole-Room 3D Context Model for Panoramic Scene Understanding
- Yinda Zhang, Shuran Song, P. Tan, Jianxiong Xiao
- Computer Science, European Conference on Computer Vision
- 6 September 2014
Experiments show that, based solely on 3D context without any image-region category classifier, the proposed whole-room context model achieves performance comparable to the state-of-the-art object detector, demonstrating that when the FOV is large, context is as powerful as object appearance.
Deep Depth Completion of a Single RGB-D Image
- Yinda Zhang, T. Funkhouser
- Computer Science, IEEE/CVF Conference on Computer Vision and…
- 25 March 2018
A deep network is trained that takes an RGB image as input and predicts dense surface normals and occlusion boundaries, which are then combined with the raw depth observations provided by the RGB-D camera to solve for the depth of all pixels, including those missing from the original observation.
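The final step described above is a global solve: sparse observed depths act as data constraints and the predicted surface orientation supplies gradient constraints, and both are stacked into one least-squares system. A minimal 1D sketch of that idea (illustrative only; the weights, the zero-gradient target, and the toy 8-pixel setup are assumptions, not the paper's actual formulation):

```python
import numpy as np

# Toy 1D "scanline" with 8 pixels: depth is observed only at the two ends,
# and the predicted surface is flat, so the target depth gradient is zero.
n = 8
observed = {0: 1.0, 7: 1.0}        # sparse raw depth: pixel index -> depth
target_grad = np.zeros(n - 1)      # gradient targets from predicted normals

rows, cols, vals, b = [], [], [], []
w_data, w_grad = 1.0, 1.0          # relative weighting of the two terms
r = 0
for i, d in observed.items():      # data term: z_i ≈ observed depth
    rows.append(r); cols.append(i); vals.append(w_data)
    b.append(w_data * d); r += 1
for i in range(n - 1):             # smoothness term: z_{i+1} - z_i ≈ target
    rows += [r, r]; cols += [i, i + 1]; vals += [-w_grad, w_grad]
    b.append(w_grad * target_grad[i]); r += 1

A = np.zeros((r, n))
A[rows, cols] = vals
z = np.linalg.lstsq(A, np.array(b), rcond=None)[0]  # dense depth, all pixels
```

Here the solve fills every missing pixel with depth 1.0, the only solution satisfying both the end-point observations and the zero-gradient constraints; the paper's 2D system is the same idea with normals and occlusion boundaries shaping the gradient constraints.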
Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks
- Yinda Zhang, Shuran Song, T. Funkhouser
- Computer Science, Computer Vision and Pattern Recognition
- 22 December 2016
This work introduces a large-scale synthetic dataset with 500K physically-based rendered images from 45K realistic 3D indoor scenes and shows that pretraining with this new synthetic dataset can improve results beyond the current state of the art on all three computer vision tasks.
Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation
- Chao Wen, Yinda Zhang, Zhuwen Li, Yanwei Fu
- Computer Science, IEEE International Conference on Computer Vision
- 5 August 2019
This model learns to predict a series of deformations that iteratively improve a coarse shape, and exhibits generalization capability across different semantic categories, numbers of input images, and qualities of mesh initialization.
TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking
- Pingmei Xu, Krista A. Ehinger, Yinda Zhang, A. Finkelstein, Sanjeev R. Kulkarni, Jianxiong Xiao
- Computer Science, ArXiv
- 25 April 2015
This paper introduces a webcam-based gaze tracking system that supports large-scale, crowdsourced eye tracking deployed on Amazon Mechanical Turk (AMTurk), and builds a saliency dataset for a large number of natural images.
DIST: Rendering Deep Implicit Signed Distance Function With Differentiable Sphere Tracing
- Shaohui Liu, Yinda Zhang, Songyou Peng, Boxin Shi, M. Pollefeys, Zhaopeng Cui
- Computer Science, Computer Vision and Pattern Recognition
- 29 November 2019
This work proposes a differentiable sphere tracing algorithm that can effectively reconstruct accurate 3D shapes from various inputs, such as sparse depth and multi-view images, through inverse optimization, and shows excellent generalization capability and robustness to various kinds of noise.
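The core renderer here is classic sphere tracing over a signed distance function: a ray marches forward by the SDF value at the current point, which is always a safe step because a sphere of that radius is guaranteed empty. A minimal non-differentiable sketch of that marching loop (the step counts, tolerances, and the unit-sphere SDF are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def sphere_trace(sdf, origin, direction, max_steps=64, eps=1e-4, max_dist=10.0):
    """March a ray through an SDF; return the surface hit point or None.

    `sdf` is any callable mapping a 3D point to its signed distance,
    here a stand-in for a learned deep implicit function.
    """
    t = 0.0
    for _ in range(max_steps):
        p = origin + t * direction
        d = sdf(p)
        if d < eps:        # within tolerance of the zero level set: hit
            return p
        t += d             # safe step: a sphere of radius d around p is empty
        if t > max_dist:
            break
    return None            # ray escaped the scene without hitting the surface

# Example: a unit sphere at the origin, viewed from z = -3 looking along +z.
unit_sphere = lambda p: np.linalg.norm(p) - 1.0
hit = sphere_trace(unit_sphere, np.array([0.0, 0.0, -3.0]),
                   np.array([0.0, 0.0, 1.0]))
# hit ≈ (0, 0, -1), the near surface of the sphere
```

The paper's contribution is making this loop differentiable so gradients flow from rendered observations back to the implicit shape; the sketch above shows only the forward marching it builds on.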
...