Computer Vision (CSE 490CV/EE 400B), Winter 2002

Project 3:  Single View Modeling

Assigned:  Wednesday, Feb 27, 2002
Due:  Tuesday, Mar 12, 2002 (by 11:59pm)

In this assignment you and your partner will create 3D texture-mapped models from a single image using the single view modeling method discussed in class.  You may find the following resources useful:

Note that all of the above describe slightly different methods that you can use to compute the same information.  Choose the one that you find the most natural and useful.

The steps of the project are:

  1. Image acquisition
  2. Calculate vanishing points
  3. Choose reference points
  4. Compute textures and 3-D positions and create a VRML model
  5. Submit results

Image Acquisition

For this assignment you should take high resolution (preferably at least 800x800) images or scans of at least two different scenes. One of your images should be a sketch or painting. For instance, a photo of a Greek temple and a painting of Leonardo da Vinci's "The Last Supper" might be interesting choices. (We don't want everyone in the class to do these objects, however.) Note also that the object you digitize need not be monumental or a building exterior. An office interior or desk is also a possibility. At the other extreme, aerial photographs of a section of a city could also be good source material (you might have more occlusion in this case, necessitating some manual fabrication of textures for occluded surfaces). Be sure to choose images that accurately model perspective projection without radial distortion. You'll want to choose images that are complex enough to create an interesting model with at least ten textured polygons, yet not so complex that the resulting model is hard to digitize or approximate.

Calculating Vanishing Points

Choose a scene coordinate frame by defining lines in the scene that are parallel to the X, Y, and Z axes. For each axis, specify more than two lines parallel to that axis. The intersection of these lines in the image defines the corresponding vanishing point. A vanishing point may be "at infinity". Since the accuracy of your model depends on the precision of the vanishing points, implement a robust technique for computing vanishing points that uses more than two lines. Here is a write-up for a recommended method that extends the cross-product method discussed in class to return the best intersection point of three or more lines in a least-squares sense, and helper code for eigen-decomposing symmetric matrices, with example uses.
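To make the least-squares idea concrete, here is a minimal sketch (the function name and NumPy usage are ours, not the linked helper code's): each segment defines a homogeneous line l = p1 x p2, and the best vanishing point minimizes the sum of (l_i . v)^2 over unit vectors v, which is the eigenvector of sum(l_i l_i^T) with the smallest eigenvalue.

```python
import numpy as np

def vanishing_point(segments):
    """Best intersection of 3+ image line segments, in a least-squares sense.

    segments: list of ((x1, y1), (x2, y2)) endpoint pairs.
    Returns the vanishing point in homogeneous coordinates
    (third component near zero means the point is "at infinity").
    """
    M = np.zeros((3, 3))
    for (x1, y1), (x2, y2) in segments:
        # Homogeneous line through the two endpoints: l = p1 x p2.
        l = np.cross([x1, y1, 1.0], [x2, y2, 1.0])
        l /= np.linalg.norm(l)  # normalize so each line counts equally
        M += np.outer(l, l)
    # v minimizes sum (l_i . v)^2 subject to ||v|| = 1: the eigenvector
    # of M with the smallest eigenvalue (eigh sorts eigenvalues ascending).
    w, V = np.linalg.eigh(M)
    return V[:, 0]
```

Dividing the result by its third component gives ordinary pixel coordinates when the vanishing point is finite.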

To compute vanishing points, choose line segments that are as long as possible and far apart in the image. Use high resolution images, and use the zoom feature to specify line endpoints with sub-pixel accuracy. A small number of "good" lines is generally better than many inaccurate lines. Use the "save" feature in your program so that you don't have to recalculate vanishing points every time you load the same image.

Choose Reference Points

You will need to set the reference points as described in lecture and in the write-ups. One way of doing this is to measure, when you shoot the picture, the 3-D positions of four points on the reference plane and one point off of that plane. The four reference-plane points and their image projections define a 3x3 matrix H that maps u-v image points to X-Y positions on the plane. The fifth point determines the reference height R off of the plane, as described in lecture. Alternatively, you can specify H and R without physical measurement by identifying a regular structure, such as a cube, and choosing its dimensions to be unit lengths. This latter approach is necessary for paintings and other scenes in which physical measurements are not feasible.
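One standard way to compute H from the four correspondences is the direct linear transform (DLT); this sketch (function name and details are our own, not part of the skeleton code) stacks two linear constraints per point and takes the null vector via SVD:

```python
import numpy as np

def homography(src, dst):
    """3x3 matrix H mapping four (u, v) image points to four (X, Y)
    plane positions, via the direct linear transform (DLT).

    src, dst: lists of four (x, y) pairs.
    """
    A = []
    for (u, v), (x, y) in zip(src, dst):
        # Each correspondence gives two rows of the constraint matrix.
        A.append([-u, -v, -1, 0, 0, 0, u * x, v * x, x])
        A.append([0, 0, 0, -u, -v, -1, u * y, v * y, y])
    # The homography (up to scale) is the right singular vector of A
    # with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]
```

To map an image point, multiply [u, v, 1] by H and divide by the third component.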

Compute 3D Positions

There are two different kinds of measurement: in-plane and out-of-plane. You can combine the two to increase the power of the technique. For instance, once you have computed the height of one point X off of the reference plane P, you can compute the coordinates of any other point on the plane that passes through X parallel to P (see the "man on box" slide from lecture). By choosing more than one reference plane, you can make even more measurements. Be creative, and describe on your web page what you did to make measurements.

Compute Texture Maps

Use the points you have measured to define several planar patches in the scene. Note that even though your measurements may be in horizontal or vertical directions, you can include planes that are slanted, such as a roof.

The last step is to compute texture maps for each of these patches. If the patch is a rectangle in the scene, e.g., a wall or door, all that is needed is to warp the quadrilateral image region into a rectangular texture image. You can use the technique described in class to identify the best homography warp between the original and texture image, using the constraint that the four corners of the quadrilateral map to the corners of the texture image.  We recommend using inverse warping of the pixels in the texture image into pixels in the original image, and bilinear interpolation to evaluate fractional pixel values in the original image. It is best to choose the width and height of the texture image to be about the same as that of the original quadrilateral, to avoid loss of resolution.  You may either write your own inverse warping code, or modify the warping code from project 2 to compute homographies instead of cylindrical projections.
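If you write your own, the inverse-warp loop can be sketched like this, assuming NumPy and a homography H_tex2src that maps texture pixels back into the original image (the names are ours, not the project 2 skeleton's):

```python
import numpy as np

def rectify_patch(src, H_tex2src, width, height):
    """Inverse-warp a quadrilateral patch of `src` into a (height x width)
    texture image.

    src: float H x W x C image array.
    H_tex2src: 3x3 homography mapping texture (x, y, 1) to source pixels.
    Fractional source positions are bilinearly interpolated; pixels that
    land outside the source are left black.
    """
    tex = np.zeros((height, width, src.shape[2]))
    for y in range(height):
        for x in range(width):
            # Map the texture pixel back into the original image.
            p = H_tex2src @ np.array([x, y, 1.0])
            sx, sy = p[0] / p[2], p[1] / p[2]
            x0, y0 = int(np.floor(sx)), int(np.floor(sy))
            if 0 <= x0 < src.shape[1] - 1 and 0 <= y0 < src.shape[0] - 1:
                # Bilinear interpolation of the four neighboring pixels.
                fx, fy = sx - x0, sy - y0
                top = (1 - fx) * src[y0, x0] + fx * src[y0, x0 + 1]
                bot = (1 - fx) * src[y0 + 1, x0] + fx * src[y0 + 1, x0 + 1]
                tex[y, x] = (1 - fy) * top + fy * bot
    return tex
```

A vectorized version (building a grid of texture coordinates and mapping them all at once) is much faster, but the per-pixel loop above shows the logic.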

If the patch is a non-rectangular region such as the outline of a person, you will need to perform the following steps: (1) define a quadrilateral in the image containing the region you want, (2) warp this into a rectangular texture image, as before, and (3) edit the texture image and mark out "transparent" pixels using your project 1 code or other image editing software.

Create a VRML model

For each image you work from, create a VRML model (see documentation below) with at least 10 texture-mapped polygonal faces. The skeleton code will create the VRML file for you, but you need to add texture-map images and masks for each polygon, in .gif or .jpg format.
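The skeleton code emits the VRML for you, but for reference, a single texture-mapped quad in VRML 2.0 looks roughly like this, generated here by a small hypothetical Python helper (field names follow the VRML97 spec; the exact node layout the skeleton produces may differ):

```python
def vrml_quad(texture_file, corners):
    """Minimal VRML 2.0 file containing one texture-mapped quad.

    texture_file: name of the .gif or .jpg texture image.
    corners: four (x, y, z) tuples in counter-clockwise order, matched to
    texture coordinates (0,0), (1,0), (1,1), (0,1).
    """
    pts = ", ".join("%g %g %g" % c for c in corners)
    return """#VRML V2.0 utf8
Shape {
  appearance Appearance {
    texture ImageTexture { url "%s" }
  }
  geometry IndexedFaceSet {
    coord Coordinate { point [ %s ] }
    coordIndex [ 0, 1, 2, 3, -1 ]
    texCoord TextureCoordinate { point [ 0 0, 1 0, 1 1, 0 1 ] }
    texCoordIndex [ 0, 1, 2, 3, -1 ]
  }
}
""" % (texture_file, pts)
```

A full model is just a sequence of such Shape nodes after a single `#VRML V2.0 utf8` header line.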

Submit Results

Put your code and executable in the project 3 turnin directory, and your images and VRML models in the artifact directory with a web page project3.htm that contains:

This project was also used in the Image-Based Modeling and Rendering class I taught at Carnegie Mellon University with Paul Heckbert. Click here for some of the best results.

Skeleton Code

We provide skeleton code for you to start from. The skeleton code provides an interface and several basic data structures for you to work with. We hope it will save you some labor. However, you don't have to use our skeleton code.

Bells and Whistles

Here is a list of suggestions for extending the program for extra credit. You are encouraged to come up with your own extensions. We're always interested in seeing new, unanticipated ways to use this program!

[whistle] Show the camera position in each VRML file, marked by a sphere or other shape.  We discussed how to obtain the height of the camera in lecture.  The X position can be obtained in exactly the same way, using the vanishing line between the Y and Z vanishing points and a reference length parallel to the X axis (and similarly for the Y position).

[bell] Merge models from multiple images. For instance, create a complete model of a building exterior from a few photographs that capture all four sides.

[bell][bell] Extend the method to create a 3D model from a cylindrical panorama.  Hint: parallel lines in a panorama sweep out a curved path; you need to determine what this curve is.


Resources


Last modified March 05, 2002