Recovering Surface Layout from an Image

@article{Hoiem2006RecoveringSL,
  title={Recovering Surface Layout from an Image},
  author={Derek Hoiem and Alexei A. Efros and Martial Hebert},
  journal={International Journal of Computer Vision},
  year={2007},
  volume={75},
  pages={151--172}
}
Humans have an amazing ability to instantly grasp the overall 3D structure of a scene—ground orientation, relative positions of major landmarks, etc.—even from a single image. This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and/or view it through a patch-sized peephole. Yet it seems very likely that having a grasp of this “surface layout” of a scene should be of great assistance for many tasks, including recognition, navigation… 
Seeing the world behind the image: Spatial layout for 3D scene understanding
TLDR
This dissertation proposes methods to recover the basic spatial layout of a scene from a single image and begins to investigate its use as a foundation for scene understanding; it demonstrates the importance of robustness achieved through a wide variety of image cues, multiple segmentations, and a general strategy of soft decisions and gradual inference of image structure.
3d spatial layout and geometric constraints for scene understanding
TLDR
This work builds representations and proposes strategies for exploiting constraints towards extracting a 3D understanding of a scene from a single image, and shows how to use the 3D spatial layout models together with object cuboid models to predict the free space in the scene.
Dominant plane recognition in interior scenes from a single image
TLDR
This work presents initial results of a novel methodology for dominant-plane recognition in a single image, combining three key strategies: a learning algorithm, a segmentation scheme, and a contour detection method.
3DNN: Viewpoint Invariant 3D Geometry Matching for Scene Understanding
  • Scott Satkin, M. Hebert
  • Mathematics, Computer Science
    2013 IEEE International Conference on Computer Vision
  • 2013
TLDR
This work describes the 3DNN algorithm and rigorously evaluates its performance for the tasks of geometry estimation and object detection/segmentation, developing a scene matching approach which is truly 100% viewpoint invariant, yielding state-of-the-art performance on challenging data.
Understanding the 3D layout of a cluttered room from multiple images
TLDR
This work presents a novel framework for robustly understanding the geometrical and semantic structure of a cluttered room from a small number of images captured from different viewpoints and shows an augmented reality mobile application to highlight the high accuracy of the method.
Interpreting the structure of single images by learning from examples
TLDR
This work developed a plane detection algorithm, which is able to find planar surfaces in a single still image and estimate their orientation with respect to the camera, and demonstrated an application of this to visual odometry, where single-image plane detection allows structure-rich maps to be built quickly.
Recovering Occlusion Boundaries from a Single Image
TLDR
The goal is to recover the occlusion boundaries and depth ordering of free-standing structures in the scene using the traditional edge and region cues together with 3D surface and depth cues.
Coherent Scene Understanding With 3D Geometric Reasoning
TLDR
This dissertation proposes methods that produce a geometrically and semantically coherent 3D interpretation of urban scenes from a single image, and shows the benefits of reasoning in 3D when analyzing 2D images.
Geometric context from a single image
TLDR
This work shows that it can estimate the coarse geometric properties of a scene by learning appearance-based models of geometric classes, even in cluttered natural scenes, and provides a multiple-hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label.
Qualitative 3D Surface Reconstruction from Images
Prior to the advent of appearance-based recognition in the early 1990s, object categorization researchers modeled the prototypical shape of an object, seeking models that were invariant to changes
...

References

Showing 1-10 of 72 references
Geometric context from a single image
TLDR
This work shows that it can estimate the coarse geometric properties of a scene by learning appearance-based models of geometric classes, even in cluttered natural scenes, and provides a multiple-hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label.
Bayesian reconstruction of 3D shapes and scenes from a single image
  • Feng Han, Song-Chun Zhu
  • Mathematics, Computer Science
    First IEEE International Workshop on Higher-Level Knowledge in 3D Modeling and Motion Analysis, 2003. HLK 2003.
  • 2003
TLDR
This work represents prior knowledge of 3D shapes and scenes by probabilistic models at two levels, both defined on graphs, and assumes that objects are supported for maximum stability, with global bounding surfaces such as ground, sky, and walls.
Depth from Familiar Objects: A Hierarchical Model for 3D Scenes
TLDR
An integrated, probabilistic model for the appearance and three-dimensional geometry of cluttered scenes and a robust likelihood model accounts for outliers in matched stereo features, allowing effective learning of 3D object structure from partial 2D segmentations.
Automatic photo pop-up
This paper presents a fully automatic method for creating a 3D model from a single photograph. The model is made up of several texture-mapped planar billboards and has the complexity of a typical children's pop-up book illustration.
Putting Objects in Perspective
TLDR
This paper provides a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint by allowing probabilistic object hypotheses to refine geometry and vice-versa.
Computer Recognition of Three-Dimensional Objects in a Visual Scene
TLDR
The main conclusion is that it is possible to separate a picture or scene into its constituent objects exclusively on the basis of monocular geometric properties (on the basis of pure form); in fact, successful methods are shown.
Representing and Recognizing the Visual Appearance of Materials using Three-dimensional Textons
TLDR
A unified model is provided for constructing a vocabulary of prototype tiny surface patches with associated local geometric and photometric properties, represented as a set of linear Gaussian derivative filter outputs under different lighting and viewing conditions.
Depth Estimation from Image Structure
TLDR
It is demonstrated that, by recognizing the properties of the structures present in the image, one can infer the scale of the scene and, therefore, its absolute mean depth.
A Dynamic Bayesian Network Model for Autonomous 3D Reconstruction from a Single Indoor Image
  • E. Delage, Honglak Lee, A. Ng
  • Computer Science
    2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)
  • 2006
TLDR
This paper presents a dynamic Bayesian network model capable of resolving some of the ambiguities of monocular vision and recovering 3d information for many images and shows that this model can be used for 3d reconstruction from a single image.
Recovering Intrinsic Scene Characteristics from Images
We suggest that an appropriate role of early visual processing is to describe a scene in terms of intrinsic (veridical) characteristics -- such as range, orientation, reflectance, and incident illumination
...