Charles Ruizhongtai Qi

Object viewpoint estimation from 2D images is an essential task in computer vision. However, two issues hinder its progress: scarcity of training data with viewpoint annotations, and a lack of powerful features. Inspired by the growing availability of 3D models, we propose a framework to address both issues by combining render-based image synthesis and CNNs…
3D shape models are becoming widely available and easier to capture, making available 3D information crucial for progress in object classification. Current state-of-the-art methods rely on CNNs to address this problem. Recently, we have witnessed two types of CNNs being developed: CNNs based upon volumetric representations versus CNNs based upon multi-view…
Point cloud is an important type of geometric data structure. Due to its irregular format, most researchers transform such data to regular 3D voxel grids or collections of images. This, however, renders data unnecessarily voluminous and causes issues. In this paper, we design a novel type of neural network that directly consumes point clouds, which well…
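The core idea behind consuming point clouds directly is a symmetric aggregation: a shared per-point function followed by max pooling, so the output does not depend on point ordering. A minimal sketch with hypothetical random weights (not the paper's trained network) illustrates the invariance:

```python
import numpy as np

# Shared per-point "MLP" weights; purely illustrative, not trained parameters.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 16))
W2 = rng.standard_normal((16, 32))

def pointnet_feature(points):
    """points: (N, 3) array -> (32,) order-invariant global feature."""
    h = np.maximum(points @ W1, 0.0)  # shared layer + ReLU, applied per point
    h = np.maximum(h @ W2, 0.0)
    return h.max(axis=0)              # symmetric max pooling over all points

cloud = rng.standard_normal((128, 3))
shuffled = cloud[rng.permutation(len(cloud))]
# The global feature is identical regardless of point ordering.
assert np.allclose(pointnet_feature(cloud), pointnet_feature(shuffled))
```

Because max pooling is permutation-invariant, no voxelization or image rendering of the point set is needed before feeding it to the network.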
Both 3D models and 2D images contain a wealth of information about everyday objects in our environment. However, it is difficult to semantically link together these two media forms, even when they feature identical or very similar objects. We propose a joint embedding space populated by both 3D shapes and 2D images of objects, where the distances…
We introduce a data-driven approach to complete partial 3D shapes through a combination of volumetric deep neural networks and 3D shape synthesis. From a partially-scanned input shape, our method first infers a low-resolution – but complete – output. To this end, we introduce a 3D Encoder-Predictor Network (3D-EPN) which is composed of 3D convolutional…
Building discriminative representations for 3D data has been an important task in computer graphics and computer vision research. Convolutional Neural Networks (CNNs) have been shown to operate on 2D images with great success for a variety of tasks. Lifting convolution operators to 3D (3DCNNs) seems like a plausible and promising next step. Unfortunately, the…
Few prior works study deep learning on point sets. PointNet [20] is a pioneer in this direction. However, by design PointNet does not capture local structures induced by the metric space points live in, limiting its ability to recognize fine-grained patterns and its generalizability to complex scenes. In this work, we introduce a hierarchical neural network…
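The hierarchical design starts by selecting well-spread centroid points around which local neighborhoods are grouped. A common choice for that selection step is farthest point sampling; the sketch below is an illustrative numpy implementation (variable names are hypothetical, not taken from the paper's code):

```python
import numpy as np

def farthest_point_sampling(points, k):
    """points: (N, 3); return indices of k points spread across the set."""
    chosen = [0]  # start from an arbitrary seed point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        idx = int(dist.argmax())  # point farthest from the chosen set so far
        chosen.append(idx)
        # Each point keeps its distance to the nearest chosen centroid.
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return np.array(chosen)

rng = np.random.default_rng(1)
pts = rng.standard_normal((256, 3))
centroids = farthest_point_sampling(pts, 32)
assert len(set(centroids.tolist())) == 32  # 32 distinct, well-spread centroids
```

Grouping neighbors around such centroids and applying a shared point-feature network per group, then repeating at coarser scales, yields the hierarchical local-structure capture that plain PointNet lacks.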
Training for Our Volumetric CNNs To produce occupancy grids from meshes, the faces of a mesh are subdivided until the length of the longest edge is within a single voxel; then all voxels that intersect with a face are marked as occupied. For 3D resolutions 10, 30, and 60 we generate voxelizations with central regions 10, 24, 54 and padding 0, 3, 3…
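Note how the numbers fit together: central region plus padding on both sides gives the full resolution (24 + 2·3 = 30, 54 + 2·3 = 60). A hedged sketch of that padded occupancy-grid layout, using a point set in place of mesh faces for brevity:

```python
import numpy as np

def voxelize(points, central=24, pad=3):
    """Scale points into a central region of a padded occupancy grid."""
    res = central + 2 * pad  # 30 for the central-24 / padding-3 setting
    lo, hi = points.min(axis=0), points.max(axis=0)
    scaled = (points - lo) / np.maximum(hi - lo, 1e-9)   # normalize to [0, 1]
    idx = np.clip((scaled * central).astype(int), 0, central - 1) + pad
    grid = np.zeros((res, res, res), dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True         # mark occupied voxels
    return grid

rng = np.random.default_rng(2)
grid = voxelize(rng.standard_normal((500, 3)))
assert grid.shape == (30, 30, 30)
assert not grid[:3].any() and not grid[-3:].any()  # padding voxels stay empty
```

The actual pipeline marks voxels intersected by subdivided mesh faces rather than sample points, but the grid layout (central region surrounded by empty padding) is the same.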
In Sec B we extend the robustness test to compare PointNet with VoxNet on incomplete input. In Sec C we provide more details on neural network architectures and training parameters, and in Sec D we describe our detection pipeline in scenes. Then Sec E illustrates more applications of PointNet, while Sec F shows more analysis experiments. Sec G provides a proof…
This paper describes our efforts to include a hands-on component in the teaching of core concepts of digital signal processing. The basis of our approach was the low-cost and open-source “Stanford Lab in a Box.” This system, with its easy-to-use Arduino-like programming interface, allowed students to see how fundamental DSP concepts such as…