Structural-RNN: Deep Learning on Spatio-Temporal Graphs
TLDR
This paper develops a scalable method for casting an arbitrary spatio-temporal graph as a rich RNN mixture that is feedforward, fully differentiable, and jointly trainable, and shows that it improves over the state of the art by a large margin.
Deep learning for detecting robotic grasps
TLDR
This work presents a two-step cascaded system with two deep networks, where the top detections from the first are re-evaluated by the second, and shows that this method improves performance on an RGBD robotic grasping dataset, and can be used to successfully execute grasps on two different robotic platforms.
Learning human activities and object affordances from RGB-D videos
TLDR
This work considers the problem of extracting a descriptive labeling of the sequence of sub-activities being performed by a human and, more importantly, of their interactions with objects in the form of associated affordances, and formulates the learning problem using a structural support vector machine (SSVM) approach.
Learning Depth from Single Monocular Images
TLDR
This work begins by collecting a training set of monocular images (of unstructured outdoor environments which include forests, trees, buildings, etc.) and their corresponding ground-truth depthmaps, and applies supervised learning to predict the depthmap as a function of the image.
Unstructured human activity detection from RGBD images
TLDR
This paper uses an RGBD sensor as the input, computes a set of features based on human pose and motion as well as on image and point-cloud information, and models activities with a hierarchical maximum entropy Markov model (MEMM).
Efficient grasping from RGBD images: Learning using a new rectangle representation
TLDR
This work proposes a new ‘grasping rectangle’ representation: an oriented rectangle in the image plane that captures the location, the orientation, and the gripper opening width, and shows that the algorithm can successfully pick up a variety of novel objects.
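The oriented-rectangle representation above can be sketched with a few lines of geometry. The field names (`cx`, `cy`, `angle`, `width`, `height`) are illustrative assumptions, not the paper's exact parameterization; the point is that location, orientation, and gripper opening width together determine the four corners in the image plane.

```python
import math

class GraspRectangle:
    """Hypothetical sketch of a 'grasping rectangle' in image coordinates."""

    def __init__(self, cx, cy, angle, width, height):
        self.cx, self.cy = cx, cy  # center location (pixels)
        self.angle = angle         # orientation (radians)
        self.width = width         # gripper opening width
        self.height = height       # gripper plate extent

    def corners(self):
        """Return the four corners of the oriented rectangle, counter-clockwise."""
        c, s = math.cos(self.angle), math.sin(self.angle)
        pts = []
        for dx, dy in [(-1, -1), (1, -1), (1, 1), (-1, 1)]:
            x, y = dx * self.width / 2, dy * self.height / 2
            # rotate the local offset by angle, then translate to the center
            pts.append((self.cx + x * c - y * s, self.cy + x * s + y * c))
        return pts
```

For an axis-aligned rectangle (`angle = 0`) the corners reduce to the center plus or minus half the width and height, which makes the parameterization easy to sanity-check.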
Human Activity Detection from RGBD Images
TLDR
This paper uses an RGBD sensor as the input sensor, presents learning algorithms to infer a person's activities based on a hierarchical maximum entropy Markov model (MEMM), and infers the two-layered graph structure using a dynamic programming approach.
Robotic Grasping of Novel Objects using Vision
TLDR
This work considers the problem of grasping novel objects, specifically objects that are being seen for the first time through vision, and presents a learning algorithm that neither requires nor tries to build a 3-D model of the object.
Car that Knows Before You Do: Anticipating Maneuvers via Learning Temporal Driving Models
TLDR
This work proposes an Autoregressive Input-Output HMM to model the contextual information along with the driving maneuvers, and shows that it can anticipate maneuvers 3.5 seconds before they occur with over 80% F1-score in real time.
3-D Depth Reconstruction from a Single Still Image
TLDR
This work proposes a model that incorporates both monocular cues and stereo (triangulation) cues, to obtain significantly more accurate depth estimates than is possible using either monocular or stereo cues alone.