## Off-Line Planning for On-Line Object Localization

- Tomás Lozano-Pérez, W. Eric L. Grimson
- Published 1986 in FJCC

Many robot applications require using sensors to locate objects whose initial pose is constrained but not exactly known. Most techniques for object localization assume that the object’s pose is completely unknown. This paper describes a simple method for localizing known objects in a scene. We describe how an off-line computation that exploits constraints on the object’s expected pose can be used to reduce the expected time for the on-line computation to localize the object. The objects treated here are modeled as polyhedra that, in principle, can have up to six degrees of positional freedom relative to the sensors. An important subgoal is that both the on-line and off-line methods should be simple enough to be easily implemented.

The method described here can be applied to both two-dimensional and three-dimensional sensing situations. In the two-dimensional case, objects have only three degrees of positional freedom relative to the sensor (two translational and one rotational). In this case, the sensors (and their pre-processors) are assumed to compute edges, that is, line segments in the scene. In the three-dimensional case, objects have up to three translational and three rotational degrees of freedom. In this case, the sensors (and their pre-processors) are assumed to compute planar patches in the scene. We do not deal with the general case in which only two-dimensional data is available but the object has more than three degrees of freedom. For the sake of brevity, we limit our discussion to the three-dimensional case; the specialization to two dimensions is straightforward.

### 0. Introduction

The problems of object recognition and localization have received a great deal of attention (see [Jain 86, Grimson and Lozano-Pérez 84, 85] for reviews of the literature). Most approaches to recognition assume that the object’s pose is entirely unconstrained.
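To make the degrees of freedom concrete, here is a minimal sketch of the two-dimensional case described above, where a pose has two translational parameters and one rotational parameter. The function name and the NumPy representation are my own, not the paper's:

```python
import numpy as np

def apply_pose_2d(tx, ty, theta, points):
    """Apply a 2-D rigid pose (tx, ty, theta) -- the three positional
    degrees of freedom of the planar case -- to an (N, 2) point array."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s],
                  [s,  c]])
    return points @ R.T + np.array([tx, ty])
```

In the three-dimensional case, R becomes a 3x3 rotation matrix (three rotational degrees of freedom) and the translation a 3-vector, giving the six parameters mentioned above.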
In most practical robotics applications, however, the uncertainty in part location is bounded to relatively small ranges. These constraints may come from knowledge of the feeding mechanisms or the physics of part stability. In most recognition systems, it is difficult to incorporate these kinds of constraints on the initial object pose. There have been a few systems where such information is readily incorporated, but the methods themselves have tended to be fairly complex [Bolles 76, Brooks 81, Goad 83, Baird 85, Faugeras and Hebert 83]. Nevertheless, the approach described here was significantly influenced by these previous methods, especially Goad’s excellent paper. The localization algorithm described here is quite simple, as is the mechanism for incorporating any available constraints on object pose. In the absence of any global constraints, the algorithm will still work, albeit more slowly.

The specific problem considered in this paper is how to locate a known object in a cluttered scene using sensors that provide dense position information. We assume that worst-case bounds on the pose of the object are available, as well as bounds on sensor measurement error. Our goal is to exploit the known bounds on object pose so as to reduce the amount of on-line computation required to localize the object.

We assume that the objects of interest can be modeled as sets of planar faces. Only the individual plane equations and dimensions of the model faces are needed. No face, edge, or vertex connectivity information is required; the model faces do not even have to be connected. Because of this, the method can be applied to curved objects that are readily approximated by planar patches. Of course, such planar approximations are not adequate for all cases, for example, objects of high curvature or multiply curved surfaces. We assume the availability of a sensor and pre-processor that can compute the planar patches present within some given rectangular sub-window of the scene.
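As an illustration of this face-based model, the sketch below represents each model face by its plane equation n·x = d plus its dimensions, with no connectivity information, and applies a crude candidacy test for a sensed patch under worst-case bounds. The class, function, and the specific normal-angle/offset test are assumptions of mine, not the paper's code:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class PlanarFace:
    normal: np.ndarray   # unit normal n of the plane n . x = d
    d: float             # plane offset
    extent: tuple        # face dimensions (width, height); no connectivity needed

def patch_could_match(patch_normal, patch_d, face, max_angle, max_offset):
    """A sensed planar patch is a candidate for a model face only if its
    normal direction and plane offset agree with the face to within the
    combined worst-case pose and measurement-error bounds."""
    cos_angle = float(np.clip(np.dot(patch_normal, face.normal), -1.0, 1.0))
    return np.arccos(cos_angle) <= max_angle and abs(patch_d - face.d) <= max_offset
```

Tighter a priori pose bounds shrink `max_angle` and `max_offset`, which is what lets the off-line stage prune most data-to-face assignments before the on-line computation runs.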
A great deal of work in computer vision has been dedicated to solving this problem of obtaining depth from two-dimensional visual data (see [Horn 86] for a representative sample). Other less computationally-intensive methods exist for obtaining the required patches, notably range sensing (see [Jarvis 83] for a review). We will not address this problem further.

In section 1, we present the basic on-line localization method. In section 2, we describe the off-line computations required for the on-line method. In section 3, we discuss the method and point out areas for further work.

### 1. A simple on-line localization algorithm

The process of localization is carried out in three steps:

- The first step is to identify possible assignments of sensed data to model faces consistent with a set of measurements derived from the model. This is the crucial step.
- The second step is to identify the pose of the object from each of these assignments.
- The third step is to pick the solution that best matches all the available data.

At this level of description, the method is similar to the interpretation tree method described in [Grimson and Lozano-Pérez 84, 85] and draws results from that earlier method. The method described in this paper differs in the first of these steps: whereas the earlier method does not do any hypothesis verification, the method described here goes very early into a hypothesize/verify cycle. The current method is geared to situations where the set of possible matches of data patches to model faces can be constrained a priori. The goal is to reduce the combinatorics of the matching process in the earlier methods by exploiting the global position and orientation constraints.
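The three steps above can be sketched in miniature. The toy below is my own construction, not the paper's algorithm, and handles only the rotational degree of freedom of the two-dimensional case: it forms candidate data-to-model assignments filtered by an a priori rotation bound, hypothesizes a rotation from each assignment, and immediately verifies it against all the sensed edge directions, keeping the best-scoring pose:

```python
def localize_rotation(model_angles, data_angles, theta_bounds, eps):
    """Hypothesize/verify over edge directions.

    model_angles -- known edge directions of the model (radians)
    data_angles  -- sensed edge directions in the scene
    theta_bounds -- a priori (min, max) bound on the object's rotation
    eps          -- worst-case angular measurement error
    """
    lo, hi = theta_bounds
    best = (None, -1)  # (theta, number of data edges explained)
    for da in data_angles:
        for ma in model_angles:
            theta = da - ma                      # steps 1-2: assignment -> pose
            if not (lo - eps <= theta <= hi + eps):
                continue                         # pruned by the a priori bound
            score = sum(                         # step 3: verify against all data
                any(abs((d - m) - theta) <= eps for m in model_angles)
                for d in data_angles)
            if score > best[1]:
                best = (theta, score)
    return best
```

The pose bound plays the role the paper assigns to the off-line stage: most (data, model) pairs are rejected before any verification is done, which is exactly the combinatorial saving over enumerating the full interpretation tree.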

@inproceedings{LozanoPrez1986OffLinePF,
title={Off-Line Planning for On-Line Object Localization},
author={Tom{\'a}s Lozano-P{\'e}rez and W. Eric L. Grimson},
booktitle={FJCC},
year={1986}
}