Computer Vision (CSE 490CV/EE 400B), Winter 2002

Project 3:  Single View Modeling

Assigned:  Wednesday, Feb 27, 2002
Due:  Tuesday, Mar 12, 2002 (by 11:59pm)

In this assignment you and your partner will create 3D texture-mapped models from a single image using the single view modeling method discussed in class.  You may find the following resources useful:

Note that all of the above describe slightly different methods that you can use to compute the same information.  Choose the one that you find the most natural and useful.

The steps of the project are:

  1. Image acquisition
  2. Calculate vanishing points
  3. Choose reference points
  4. Compute textures and 3-D positions and create a VRML model
  5. Submit results

Image Acquisition

For this assignment you should take high resolution (preferably at least 800x800) images or scans of at least two different scenes. One of your images should be a sketch or painting. For instance, a photo of a Greek temple and a painting of Leonardo da Vinci's "The Last Supper" might be interesting choices. (We don't want everyone in the class to do these objects, however.) Note also that the object you digitize need not be monumental or a building exterior. An office interior or desk is also a possibility. At the other extreme, aerial photographs of a section of a city could also be good source material (you might have more occlusion in this case, necessitating some manual fabrication of textures for occluded surfaces). Be sure to choose images that accurately model perspective projection without radial distortion. You'll want to choose images that are complex enough to create an interesting model with at least ten textured polygons, yet not so complex that the resulting model is hard to digitize or approximate.

Calculating Vanishing Points

Choose a scene coordinate frame by defining lines in the scene that are parallel to the X, Y, and Z axes. For each axis, specify more than two lines parallel to that axis. The intersection of these lines in the image defines the corresponding vanishing point. A vanishing point may be "at infinity". Since the accuracy of your model depends on the precision of the vanishing points, implement a robust technique for computing vanishing points that uses more than two lines. Here is a write-up for a recommended method that extends the cross-product method discussed in class to return the best intersection point of three or more lines in a least-squares sense, and helper code for eigen-decomposing symmetric matrices, with example uses.
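To make the least-squares idea concrete, here is a minimal sketch (the function name and NumPy usage are ours, not the linked helper code's): each segment defines a homogeneous line l = p1 x p2, and the best vanishing point minimizes the sum of (l_i . v)^2 over unit vectors v, which is the eigenvector of sum(l_i l_i^T) with the smallest eigenvalue.

```python
import numpy as np

def vanishing_point(segments):
    """Best intersection of 3+ image line segments, in a least-squares sense.

    segments: list of ((x1, y1), (x2, y2)) endpoint pairs.
    Returns the vanishing point in homogeneous coordinates
    (third component near zero means the point is "at infinity").
    """
    M = np.zeros((3, 3))
    for (x1, y1), (x2, y2) in segments:
        # Homogeneous line through the two endpoints: l = p1 x p2.
        l = np.cross([x1, y1, 1.0], [x2, y2, 1.0])
        l /= np.linalg.norm(l)  # normalize so each line counts equally
        M += np.outer(l, l)
    # v minimizes sum (l_i . v)^2 subject to ||v|| = 1: the eigenvector
    # of M with the smallest eigenvalue (eigh sorts eigenvalues ascending).
    w, V = np.linalg.eigh(M)
    return V[:, 0]
```

Dividing the result by its third component gives ordinary pixel coordinates when the vanishing point is finite.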

To compute vanishing points, choose line segments that are as long as possible and far apart in the image. Use high resolution images, and use the zoom feature to specify line endpoints with sub-pixel accuracy. A small number of "good" lines is generally better than many inaccurate lines. Use the "save" feature in your program so that you don't have to recalculate vanishing points every time you load the same image.

Choose Reference Points

You will need to set the reference points as described in lecture and in the write-ups. One way of doing this is to measure, when you shoot the picture, the 3-D positions of four points on the reference plane and one point off of that plane. The four reference-plane points and their image projections define a 3x3 matrix H that maps u-v image points to X-Y positions on the plane. The fifth point determines the reference height R off of the plane, as described in lecture. Alternatively, you can specify H and R without physical measurement by identifying a regular structure, such as a cube, and choosing its dimensions to be unit lengths. This latter approach is necessary for paintings and other scenes in which physical measurements are not feasible.
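One standard way to compute H from the four correspondences is the direct linear transform (DLT); this sketch (function name and details are our own, not part of the skeleton code) stacks two linear constraints per point and takes the null vector via SVD:

```python
import numpy as np

def homography(src, dst):
    """3x3 matrix H mapping four (u, v) image points to four (X, Y)
    plane positions, via the direct linear transform (DLT).

    src, dst: lists of four (x, y) pairs.
    """
    A = []
    for (u, v), (x, y) in zip(src, dst):
        # Each correspondence gives two rows of the constraint matrix.
        A.append([-u, -v, -1, 0, 0, 0, u * x, v * x, x])
        A.append([0, 0, 0, -u, -v, -1, u * y, v * y, y])
    # The homography (up to scale) is the right singular vector of A
    # with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]
```

To map an image point, multiply [u, v, 1] by H and divide by the third component.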

Compute 3D Positions

There are two different kinds of measurement: in-plane and out-of-plane. You can combine the two to increase the power of the technique. For instance, once you have computed the height of one point X off of the reference plane P, you can compute the coordinates of any other point on the plane that passes through X parallel to P (see the "man on box" slide from lecture). By choosing more than one reference plane, you can make even more measurements. Be creative, and describe on your web page what you did to make measurements.

Compute Texture Maps

Use the points you have measured to define several planar patches in the scene. Note that even though your measurements may be in horizontal or vertical directions, you can include planes that are slanted, such as a roof.

The last step is to compute texture maps for each of these patches. If the patch is a rectangle in the scene, e.g., a wall or door, all that is needed is to warp the quadrilateral image region into a rectangular texture image. You can use the technique described in class to identify the best homography warp between the original and texture image, using the constraint that the four corners of the quadrilateral map to the corners of the texture image.  We recommend using inverse warping of the pixels in the texture image into pixels in the original image, and bilinear interpolation to evaluate fractional pixel values in the original image. It is best to choose the width and height of the texture image to be about the same as that of the original quadrilateral, to avoid loss of resolution.  You may either write your own inverse warping code, or modify the warping code from project 2 to compute homographies instead of cylindrical projections.
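If you write your own, the inverse-warp loop can be sketched like this, assuming NumPy and a homography H_tex2src that maps texture pixels back into the original image (the names are ours, not the project 2 skeleton's):

```python
import numpy as np

def rectify_patch(src, H_tex2src, width, height):
    """Inverse-warp a quadrilateral patch of `src` into a (height x width)
    texture image.

    src: float H x W x C image array.
    H_tex2src: 3x3 homography mapping texture (x, y, 1) to source pixels.
    Fractional source positions are bilinearly interpolated; pixels that
    land outside the source are left black.
    """
    tex = np.zeros((height, width, src.shape[2]))
    for y in range(height):
        for x in range(width):
            # Map the texture pixel back into the original image.
            p = H_tex2src @ np.array([x, y, 1.0])
            sx, sy = p[0] / p[2], p[1] / p[2]
            x0, y0 = int(np.floor(sx)), int(np.floor(sy))
            if 0 <= x0 < src.shape[1] - 1 and 0 <= y0 < src.shape[0] - 1:
                # Bilinear interpolation of the four neighboring pixels.
                fx, fy = sx - x0, sy - y0
                top = (1 - fx) * src[y0, x0] + fx * src[y0, x0 + 1]
                bot = (1 - fx) * src[y0 + 1, x0] + fx * src[y0 + 1, x0 + 1]
                tex[y, x] = (1 - fy) * top + fy * bot
    return tex
```

A vectorized version (building a grid of texture coordinates and mapping them all at once) is much faster, but the per-pixel loop above shows the logic.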

If the patch is a non-rectangular region such as the outline of a person, you will need to perform the following steps: (1) define a quadrilateral in the image containing the region you want, (2) warp this into a rectangular texture image, as before, and (3) edit the texture image and mark out "transparent" pixels using your project 1 code or other image editing software.

Create a VRML model

For each image you work from, create a VRML model (see documentation below) with at least 10 texture-mapped polygonal faces. The skeleton code will create the VRML file for you, but you need to add texture-map images and masks for each polygon, in .gif or .jpg format.
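The skeleton code emits the VRML for you, but for reference, a single texture-mapped quad in VRML 2.0 looks roughly like this, generated here by a small hypothetical Python helper (field names follow the VRML97 spec; the exact node layout the skeleton produces may differ):

```python
def vrml_quad(texture_file, corners):
    """Minimal VRML 2.0 file containing one texture-mapped quad.

    texture_file: name of the .gif or .jpg texture image.
    corners: four (x, y, z) tuples in counter-clockwise order, matched to
    texture coordinates (0,0), (1,0), (1,1), (0,1).
    """
    pts = ", ".join("%g %g %g" % c for c in corners)
    return """#VRML V2.0 utf8
Shape {
  appearance Appearance {
    texture ImageTexture { url "%s" }
  }
  geometry IndexedFaceSet {
    coord Coordinate { point [ %s ] }
    coordIndex [ 0, 1, 2, 3, -1 ]
    texCoord TextureCoordinate { point [ 0 0, 1 0, 1 1, 0 1 ] }
    texCoordIndex [ 0, 1, 2, 3, -1 ]
  }
}
""" % (texture_file, pts)
```

A full model is just a sequence of such Shape nodes after a single `#VRML V2.0 utf8` header line.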

Submit Results

Put your code and executable in the project 3 turnin directory, and your images and VRML models in the artifact directory with a web page project3.htm that contains:

This project was also used in the Image-Based Modeling and Rendering class I taught at Carnegie Mellon University with Paul Heckbert. Click here for some of the best results.

Skeleton Code

We provide skeleton code for you to start from. The skeleton code provides an interface and several basic data structures for you to work with. We hope it will save you some labor. However, you don't have to use our skeleton code.

Bells and Whistles

Here is a list of suggestions for extending the program for extra credit. You are encouraged to come up with your own extensions. We're always interested in seeing new, unanticipated ways to use this program!

[whistle] Show the camera position in each VRML file, marked by a sphere or other shape.  We discussed how to obtain the height of the camera in lecture.  The X position can be obtained in exactly the same way, using the vanishing line between the Y and Z vanishing points and a reference length parallel to the X axis (and similarly for the Y position).

[bell] Merge models from multiple images. For instance, create a complete model of a building exterior from a few photographs that capture all four sides.

[bell][bell] Extend the method to create a 3D model from a cylindrical panorama.  Hint: parallel lines in a panorama sweep out a curved path; you need to determine what this curve is.


Resources


Last modified March 05, 2002