RAVEN: A Dataset for Relational and Analogical Visual REasoNing
This work proposes a new dataset, built in the context of Raven's Progressive Matrices (RPM), that aims to lift machine intelligence by associating vision with structural, relational, and analogical reasoning in a hierarchical representation. It establishes a semantic link between vision and reasoning by providing structure representation.
Learning Perceptual Inference by Contrasting
CoPINet is shown to set a new state of the art for permutation-invariant models on two major datasets. The work concludes that spatial-temporal reasoning depends on envisaging the possibilities consistent with the relations between objects and can be solved from pixel-level inputs.
A moving least squares material point method with displacement discontinuity and two-way rigid body coupling
In this paper, we introduce the Moving Least Squares Material Point Method (MLS-MPM). MLS-MPM naturally leads to the formulation of Affine Particle-In-Cell (APIC) [Jiang et al. 2015] and Polynomial
Human-Centric Indoor Scene Synthesis Using Stochastic Grammar
We present a human-centric method to sample and synthesize 3D room layouts and 2D images thereof, to obtain large-scale 2D/3D image data with the perfect per-pixel ground truth. An attributed spatial
Classification of Lung Nodule Malignancy Risk on Computed Tomography Images Using Convolutional Neural Network: A Comparison Between 2D and 3D Strategies
Computed tomography (CT) is the preferred method for non-invasive lung cancer screening. Early detection of potentially malignant lung nodules will greatly improve patient outcome, where an effective
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation
An end-to-end model is proposed that simultaneously solves all three tasks in real time given only a single RGB image. It significantly outperforms prior approaches on 3D object detection, 3D layout estimation, 3D camera pose estimation, and holistic scene understanding.
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image
A Holistic Scene Grammar (HSG) is introduced to represent the 3D scene structure; it characterizes a joint distribution over the functional and geometric space of indoor scenes. The approach significantly outperforms prior methods on 3D layout estimation, 3D object detection, and holistic scene understanding.
Dr. Android and Mr. Hide: Fine-grained security policies on unmodified Android
Google’s Android platform includes a permission model that protects access to sensitive capabilities, such as Internet access, GPS use, and telephony. We have found that Android’s current permissions
Holistic++ Scene Understanding: Single-View 3D Holistic Scene Parsing and Human Pose Estimation With Human-Object Interaction and Physical Commonsense
We propose a new 3D holistic++ scene understanding problem, which jointly tackles two tasks from a single-view image: (i) holistic scene parsing and reconstruction: 3D estimations of object bounding
A tale of two explanations: Enhancing human trust by explaining robot behavior
A psychological experiment examining which forms of explanation best foster human trust in a robot found that comprehensive, real-time visualizations of the robot’s internal decisions were more effective in promoting human trust than explanations based on summary text descriptions.