ELLIPSDF: Joint Object Pose and Shape Optimization with a Bi-level Ellipsoid and Signed Distance Function Description

  • Mo Shan, Qiaojun Feng, You-Yi Jau, Nikolay A. Atanasov
  • 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Autonomous systems need to understand the semantics and geometry of their surroundings in order to comprehend and safely execute object-level task specifications. This paper proposes an expressive yet compact model for joint object pose and shape optimization, and an associated optimization algorithm to infer an object-level map from multi-view RGB-D camera observations. The model is expressive because it captures the identities, positions, orientations, and shapes of objects in the environment… 
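
The bi-level idea in the title can be illustrated with a toy sketch (my own construction under stated assumptions, not the paper's actual model): a coarse ellipsoid captures the object's extent, and a fine signed-distance residual refines the surface. The `ellipsoid_sdf` approximation and the `bump` residual below are purely hypothetical.

```python
import numpy as np

# Toy bi-level shape description: coarse ellipsoid level plus a fine
# residual level that corrects the signed distance near the surface.
def ellipsoid_sdf(x, axes):
    # Common first-order approximation to an ellipsoid SDF (not exact).
    k = np.linalg.norm(x / axes)
    return (k - 1.0) * np.min(axes)

def fine_sdf(x, axes, residual):
    # Fine level: coarse ellipsoid distance plus a learned/analytic residual.
    return ellipsoid_sdf(x, axes) + residual(x)

axes = np.array([1.0, 0.5, 0.5])            # ellipsoid semi-axes
bump = lambda x: 0.05 * np.cos(4.0 * x[0])  # hypothetical residual field
d = fine_sdf(np.array([1.2, 0.0, 0.0]), axes, bump)  # query outside the surface
```

In the real model the fine level is a learned shape decoder; here the residual is just an analytic stand-in to show how the two levels compose.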

RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers

A transformer-based neural network architecture for multi-object 3D reconstruction from RGB videos that is single-stage, end-to-end trainable, and can reason holistically about a scene from multiple video frames without needing a brittle tracking step.

TaylorImNet for Fast 3D Shape Reconstruction Based on Implicit Surface Function

This work proposes TaylorImNet, an implicit 3D shape representation inspired by the Taylor series that exploits a set of discrete expansion points and corresponding Taylor expansions to model a continuous implicit shape field, achieving significantly faster generation than other baselines.
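
As a rough illustration of the expansion-point idea (a toy construction under my own assumptions, not the paper's network), one can store the value and gradient of a known SDF at a few discrete points and answer queries with a first-order Taylor expansion around the nearest point:

```python
import numpy as np

def sphere_sdf(x):            # ground-truth field: SDF of the unit sphere
    return np.linalg.norm(x) - 1.0

def sphere_grad(x):           # analytic gradient of that SDF
    return x / np.linalg.norm(x)

# Discrete expansion points with precomputed value and gradient.
points = np.array([[1.5, 0.0, 0.0], [0.0, 1.5, 0.0], [0.0, 0.0, 1.5]])
values = np.array([sphere_sdf(p) for p in points])
grads = np.array([sphere_grad(p) for p in points])

def taylor_query(x):
    # First-order Taylor expansion around the nearest expansion point.
    i = np.argmin(np.linalg.norm(points - x, axis=1))
    return values[i] + grads[i] @ (x - points[i])

q = np.array([1.4, 0.1, 0.0])
approx = taylor_query(q)      # fast local expansion
exact = sphere_sdf(q)         # exact field value for comparison
```

Evaluating a stored expansion is cheap compared to a full network forward pass, which is the intuition behind the claimed speedup.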

Generative Category-Level Shape and Pose Estimation with Semantic Primitives

  • Guanglin Li, Yifeng Li, Guofeng Zhang
  • Computer Science
  • 2022
Empowering autonomous agents with 3D understanding for daily objects is a grand challenge in robotics applications. When exploring an unknown environment, existing methods for object pose…

FroDO: From Detections to 3D Objects

FroDO is a method for accurate 3D reconstruction of object instances from RGB video that infers their location, pose and shape in a coarse-to-fine manner, embedding object shapes in a novel learnt shape space that allows seamless switching between sparse point cloud and dense DeepSDF decoding.

CubeSLAM: Monocular 3-D Object SLAM

The SLAM method achieves state-of-the-art monocular camera pose estimation while simultaneously improving 3-D object detection accuracy.

Fusion++: Volumetric Object-Level SLAM

An online object-level SLAM system is proposed that builds a persistent and accurate 3D graph map of arbitrary reconstructed objects; performance evaluation shows the approach is highly memory efficient and runs online at 4-8 Hz despite not being optimised at the software level.

Vid2CAD: CAD Model Alignment using Multi-View Constraints from Videos

The core idea of the method is to integrate neural network predictions from individual frames with a temporally global, multi-view constraint optimization formulation, which resolves the scale and depth ambiguities in the per-frame predictions, and generally improves the estimate of all pose parameters.

MOLTR: Multiple Object Localization, Tracking and Reconstruction From Monocular RGB Videos

MOLTR is presented, a solution to object-centric mapping using only monocular image sequences and camera poses; it localizes, tracks and reconstructs multiple rigid objects in an online fashion as an RGB camera captures a video of the surroundings.

NodeSLAM: Neural Object Descriptors for Multi-View Shape Reconstruction

This framework allows for accurate and robust 3D object reconstruction, enabling multiple applications including robot grasping and placing, augmented reality, and the first object-level SLAM system capable of optimising object poses and shapes jointly with the camera trajectory.

Deep-SLAM++: Object-level RGBD SLAM based on class-specific deep shape priors

This work proposes a discrete selection strategy that finds the best among multiple proposals from different registered views by reinforcing agreement with the online depth measurements, yielding an effective object-level RGBD SLAM system that produces compact, high-fidelity, dense 3D maps with semantic annotations.

QuadricSLAM: Dual Quadrics From Object Detections as Landmarks in Object-Oriented SLAM

A sensor model for object detectors is developed that addresses the challenge of partially visible objects, and it is demonstrated how to jointly estimate the camera pose and constrained dual quadric parameters in factor graph based SLAM with a general perspective camera.
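
The dual-quadric landmark parameterization admits a compact sketch: an ellipsoid's dual quadric Q* projects through a 3x4 camera matrix P to the dual conic C* = P Q* P^T of its image outline. The intrinsics and ellipsoid below are made-up illustration values, not from the paper.

```python
import numpy as np

# Dual quadric of an axis-aligned ellipsoid with semi-axes (a, b, c)
# centred at t: Q* = T diag(a^2, b^2, c^2, -1) T^T, with T the 4x4
# translation matrix.
def dual_ellipsoid(axes, t):
    T = np.eye(4)
    T[:3, 3] = t
    return T @ np.diag([axes[0]**2, axes[1]**2, axes[2]**2, -1.0]) @ T.T

def project_quadric(P, Q_star):
    # Dual conic of the ellipsoid's image outline.
    return P @ Q_star @ P.T

# Hypothetical pinhole camera at the origin looking down the z-axis.
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
C_star = project_quadric(P, dual_ellipsoid((1.0, 0.5, 0.5), (0., 0., 5.)))
```

The last column of the dual conic, normalized, gives the outline's centre, which here lands on the principal point since the ellipsoid sits on the optical axis.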

DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation

This work introduces DeepSDF, a learned continuous Signed Distance Function (SDF) representation of a class of shapes that enables high quality shape representation, interpolation and completion from partial and noisy 3D input data.
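
For context on the representation itself: a signed distance function maps a point to its distance from the surface, conventionally negative inside and positive outside, with the surface as the zero level set. DeepSDF learns such a function with a network; this toy analytic sphere just illustrates the convention.

```python
import numpy as np

# Analytic SDF of a sphere: negative inside, zero on the surface,
# positive outside.  DeepSDF replaces this closed form with a learned
# network conditioned on a per-shape latent code.
def sdf_sphere(x, radius=1.0):
    return np.linalg.norm(x, axis=-1) - radius

pts = np.array([[0.0, 0.0, 0.0],   # centre: inside
                [1.0, 0.0, 0.0],   # on the surface
                [2.0, 0.0, 0.0]])  # outside
d = sdf_sphere(pts)
```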

DeepSDF x Sim(3): Extending DeepSDF for automatic 3D shape retrieval and similarity transform estimation

This work presents a formulation of neural-network-based generative models for 3D shapes based on signed distance functions that overcomes DeepSDF's need to have query shapes in the same canonical scale and pose as those observed during training, a requirement that restricts its effectiveness in real-world scenes.