Computer Vision (CSE 490CV/EE 400B), Winter 2002

Project 2:  Single View Modeling

Assigned:  Wednesday, Feb 27, 2002
Due:  Tuesday, Mar 12, 2002 (by 11:59pm)

In this assignment you and your partner will create 3D texture-mapped models from a single image using the single view modeling method discussed in class.  You may find the following resources useful:

Note that all of the above describe slightly different methods that you can use to compute the same information.  Choose the one that you find the most natural and useful.

The steps of the project are:

  1. Image acquisition
  2. Calculate vanishing points
  3. Choose reference points
  4. Compute textures and 3-D positions and create a VRML model
  5. Submit results

Image Acquisition

For this assignment you should take high resolution (preferably at least 800x800) images or scans of at least two different scenes. One of your images should be a sketch or painting. For instance, a photo of a Greek temple and a painting of Leonardo da Vinci's "The Last Supper" might be interesting choices. (We don't want everyone in the class to do these objects, however.) Note also that the object you digitize need not be monumental, or be a building exterior. An office interior or desk is also a possibility. At the other extreme, aerial photographs of a section of a city could also be good source material (you might have more occlusion in this case, necessitating some manual fabrication of textures for occluded surfaces). Be sure to choose images that accurately model perspective projection without radial distortions. You'll want to choose images that are complex enough to create an interesting model with at least ten textured polygons, yet not so complex that the resulting model is hard to digitize or approximate.

Calculating Vanishing Points

Choose a scene coordinate frame by defining lines in the scene that are parallel to the X, Y, and Z axes. For each axis, specify more than two lines parallel to that axis. The intersection of these lines in the image defines the corresponding vanishing point. A vanishing point may be "at infinity". Since the accuracy of your model depends on the precision of the vanishing points, implement a robust technique for computing vanishing points that uses more than two lines. Here is a write-up for a recommended method that extends the cross-product method discussed in class to return the best intersection point of 3 or more lines in a least-squares sense, and helper code for solving symmetric matrix equations.
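One way this least-squares extension can be sketched: each line through homogeneous endpoints p1, p2 has coefficients l = p1 x p2, and the vanishing point v minimizing the sum of (l_i . v)^2 subject to ||v|| = 1 is the eigenvector of M = sum_i l_i l_i^T with the smallest eigenvalue. The sketch below (all names are our own, not the skeleton's) finds that eigenvector with Jacobi rotations; consult the linked write-up for the recommended formulation.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 cross(const Vec3 &a, const Vec3 &b) {
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}

// Eigenvector of a symmetric 3x3 matrix M for its smallest eigenvalue,
// found with cyclic Jacobi rotations (M is destroyed in the process).
static Vec3 smallestEigenvector(double M[3][3]) {
    double V[3][3] = {{1,0,0},{0,1,0},{0,0,1}};     // accumulated rotations
    for (int sweep = 0; sweep < 50; ++sweep)
        for (int p = 0; p < 3; ++p)
            for (int q = p + 1; q < 3; ++q) {
                if (fabs(M[p][q]) < 1e-14) continue;
                double th = 0.5 * atan2(2*M[p][q], M[q][q] - M[p][p]);
                double c = cos(th), s = sin(th);
                for (int k = 0; k < 3; ++k) {       // M <- J^T M
                    double mp = M[p][k], mq = M[q][k];
                    M[p][k] = c*mp - s*mq;  M[q][k] = s*mp + c*mq;
                }
                for (int k = 0; k < 3; ++k) {       // M <- M J,  V <- V J
                    double mp = M[k][p], mq = M[k][q];
                    M[k][p] = c*mp - s*mq;  M[k][q] = s*mp + c*mq;
                    double vp = V[k][p], vq = V[k][q];
                    V[k][p] = c*vp - s*vq;  V[k][q] = s*vp + c*vq;
                }
            }
    int best = 0;                                   // the diagonal now
    for (int i = 1; i < 3; ++i)                     // holds the eigenvalues
        if (M[i][i] < M[best][best]) best = i;
    return { V[0][best], V[1][best], V[2][best] };
}

// Best vanishing point of n >= 2 line segments, given their endpoints
// in homogeneous image coordinates.
static Vec3 bestVanishingPoint(const Vec3 e1[], const Vec3 e2[], int n) {
    double M[3][3] = {};
    for (int i = 0; i < n; ++i) {
        Vec3 l = cross(e1[i], e2[i]);               // homogeneous line
        double nrm = sqrt(l.x*l.x + l.y*l.y + l.z*l.z);
        double c[3] = { l.x/nrm, l.y/nrm, l.z/nrm }; // equal line weights
        for (int r = 0; r < 3; ++r)
            for (int s = 0; s < 3; ++s) M[r][s] += c[r]*c[s];
    }
    return smallestEigenvector(M);
}
```

The returned point is homogeneous: a result with a z component near zero is exactly the "vanishing point at infinity" case mentioned above.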

To compute vanishing points, choose line segments that are as long as possible and far apart in the image. Use high resolution images, and use the zoom feature to specify line endpoints with sub-pixel accuracy. A small number of "good" lines is generally better than many inaccurate lines. Use the "save" feature in your program so that you don't have to recalculate vanishing points every time you load the same image.

Choose Reference Points

You will need to set the reference points as described in lecture and in the write-ups. One way of doing this is to measure, in 3-D, when you shoot the picture, the positions of 4 points on the reference plane and one point off of that plane. The 4 reference plane points and their image projections define a 3x3 matrix H that maps u-v points to X-Y positions on the plane. The fifth point determines the reference height R off of the plane, as described in lecture. Alternatively, you can specify H and R without physical measurement by identifying a regular structure such as a cube and choosing its dimensions to be unit lengths. This latter approach is necessary for paintings and other scenes in which physical measurements are not feasible.
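With H[2][2] fixed to 1, the 4 plane correspondences give 8 linear equations in the remaining 8 entries of H. A sketch of recovering H this way (helper names are our own, not the skeleton's, and the 4 points are assumed to be in general position):

```cpp
#include <cmath>
#include <cstring>
#include <algorithm>

// Solve the 8x8 system A x = b by Gaussian elimination with partial pivoting.
static void solve8x8(double A[8][8], double b[8], double x[8]) {
    for (int col = 0; col < 8; ++col) {
        int piv = col;
        for (int r = col + 1; r < 8; ++r)
            if (fabs(A[r][col]) > fabs(A[piv][col])) piv = r;
        for (int c = 0; c < 8; ++c) std::swap(A[col][c], A[piv][c]);
        std::swap(b[col], b[piv]);
        for (int r = col + 1; r < 8; ++r) {
            double f = A[r][col] / A[col][col];
            for (int c = col; c < 8; ++c) A[r][c] -= f * A[col][c];
            b[r] -= f * b[col];
        }
    }
    for (int r = 7; r >= 0; --r) {                  // back-substitution
        x[r] = b[r];
        for (int c = r + 1; c < 8; ++c) x[r] -= A[r][c] * x[c];
        x[r] /= A[r][r];
    }
}

// Apply H to (u, v):  (X, Y) = dehomogenized H (u, v, 1)^T.
static void applyH(const double H[3][3], double u, double v,
                   double &X, double &Y) {
    double w = H[2][0]*u + H[2][1]*v + H[2][2];
    X = (H[0][0]*u + H[0][1]*v + H[0][2]) / w;
    Y = (H[1][0]*u + H[1][1]*v + H[1][2]) / w;
}

// Recover H from 4 correspondences (u_i, v_i) -> (X_i, Y_i).
// Each correspondence contributes two rows of the linear system.
static void solveHomography(const double u[4], const double v[4],
                            const double X[4], const double Y[4],
                            double H[3][3]) {
    double A[8][8] = {}, b[8], h[8];
    for (int i = 0; i < 4; ++i) {
        double r0[8] = { u[i], v[i], 1, 0, 0, 0, -u[i]*X[i], -v[i]*X[i] };
        double r1[8] = { 0, 0, 0, u[i], v[i], 1, -u[i]*Y[i], -v[i]*Y[i] };
        std::memcpy(A[2*i],     r0, sizeof r0);
        std::memcpy(A[2*i + 1], r1, sizeof r1);
        b[2*i] = X[i];  b[2*i + 1] = Y[i];
    }
    solve8x8(A, b, h);
    double out[3][3] = {{h[0],h[1],h[2]},{h[3],h[4],h[5]},{h[6],h[7],1}};
    std::memcpy(H, out, sizeof out);
}
```

The same routine works whether the (X, Y) values come from physical measurement or from unit lengths assigned to a cube-like structure, as described above.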

Compute 3D Positions

There are two different approaches for computing distances: in-plane measurements and out-of-plane measurements. You can combine these techniques to make more measurements than either allows alone. For instance, once you have computed the height of one point X off of the reference plane P, you can compute the coordinates of any other point on the plane through X that is parallel to P (see the man-on-box slide from lecture). By choosing more than one reference plane, you can make even more measurements. Be creative, and describe on your web page what you did to make measurements.

Compute Texture Maps

Use the points you have measured to define several planar patches in the scene. Note that even though your measurements may be in horizontal or vertical directions, you can include planes that are slanted, such as a roof.

The last step is to compute texture maps for each of these patches. If the patch is a rectangle in the scene, e.g., a wall or door, all that is needed is to warp the quadrilateral image region into a rectangular texture image. You can use the technique described in class to identify the best homography warp between the original and texture image, using the constraints that the four points of the quadrilateral map to the corners of the texture image.  We recommend using inverse warping of the pixels in the texture image into pixels in the original image, and bilinear interpolation to evaluate fractional pixel values in the original image. It is best to choose the width and height of the texture image to be about the same as that of the original quadrilateral, to avoid loss of resolution.  You may either write your own inverse warping code, or modify the warping code from project 2 to compute homographies instead of cylindrical projections.
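The inverse warping with bilinear interpolation recommended above can be sketched as follows, assuming (per the skeleton's convention) a homography H mapping normalized [0,1] texture coordinates into the original image. The grayscale image layout and helper names here are our own, for illustration only:

```cpp
#include <vector>
#include <cmath>
#include <algorithm>

struct Image {
    int w, h;
    std::vector<double> pix;           // grayscale, row-major
    double at(int x, int y) const { return pix[y * w + x]; }
};

// Bilinear interpolation at fractional position (x, y); clamps at borders.
static double bilinear(const Image &im, double x, double y) {
    int x0 = (int)floor(x), y0 = (int)floor(y);
    double fx = x - x0, fy = y - y0;
    int x1 = std::max(0, std::min(x0 + 1, im.w - 1));
    int y1 = std::max(0, std::min(y0 + 1, im.h - 1));
    x0 = std::max(0, std::min(x0, im.w - 1));
    y0 = std::max(0, std::min(y0, im.h - 1));
    return (1-fy) * ((1-fx) * im.at(x0,y0) + fx * im.at(x1,y0))
         +    fy  * ((1-fx) * im.at(x0,y1) + fx * im.at(x1,y1));
}

// Fill a texW x texH texture by inverse-warping each of its pixels
// through H into the source image and sampling bilinearly.
static Image inverseWarp(const Image &src, const double H[3][3],
                         int texW, int texH) {
    Image tex { texW, texH, std::vector<double>(texW * texH, 0.0) };
    for (int t = 0; t < texH; ++t)
        for (int s = 0; s < texW; ++s) {
            double ns = (s + 0.5) / texW, nt = (t + 0.5) / texH; // in [0,1]
            double w = H[2][0]*ns + H[2][1]*nt + H[2][2];
            double u = (H[0][0]*ns + H[0][1]*nt + H[0][2]) / w;
            double v = (H[1][0]*ns + H[1][1]*nt + H[1][2]) / w;
            tex.pix[t * texW + s] = bilinear(src, u, v);
        }
    return tex;
}
```

Iterating over the destination texture (rather than the source quadrilateral) is what avoids holes in the output; this is the essential property of inverse warping.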

If the patch is a non-rectangular region such as the outline of a person, you will need to perform the following steps: (1) define a quadrilateral in the image containing the region you want, (2) warp this into a rectangular texture image, as before, and (3) edit the texture image and mark out "transparent" pixels using your project 1 code or other image editing software.

Create a VRML model

For each image you work from, create a VRML model (see documentation below) with at least 10 texture-mapped polygonal faces. The skeleton code will create the VRML file for you but you need to add texture map images and masks for each polygon, in .gif or .jpg format.

Submit Results

Put your code and executable in the project 3 turnin directory, and your images and VRML models in the artifact directory with a web page project3.htm that contains:

Skeleton Code

We provide skeleton code for you to start from. The skeleton code provides an interface and several basic data structures for you to work with. We hope it will save you some labor. However, you are not required to use it.

The interface allows you to load an image and add points, lines, and polygons. After you compute the 3D positions of those points, you can save the model and reload it for further editing. When you are done, you can dump the model in VRML 2.0 format and view it in a VRML viewer.

The file-IO-related functions are under the "File" submenu, as usual.

Under the "Edit" submenu, you have the following choices:

Point: add or delete points. To add a point, left-click. To delete a point, move the mouse over the point until it is highlighted in white, then press "Backspace". A point can only be deleted if it is not used by any lines or polygons.

X Line, Y Line, Z Line, Other Line: add or delete lines. To add a line, the first left-click defines the start point, and the second left-click defines the end point. If you want to reuse an existing point as the start/end point, just hold "Ctrl" when you left-click. To delete a line, move the mouse onto it until it turns white and press "Backspace". In "X Line" edit mode, the lines you add are supposed to be parallel to the X axis in 3D; likewise for the "Y Line" and "Z Line" modes. Lines added in "Other Line" mode may have any orientation.

Polygon: add or delete polygons. Each polygon consists of a list of points. To add a polygon, left-click sequentially on the desired positions and then press "Enter". A closed polygon will be drawn. (You don't have to click on the first point again to close the polygon; the system does it for you automatically.) To delete a polygon, move the mouse to the center of the polygon, shown as a white square, and press "Backspace". Every time you create a new polygon, you will give it a name, e.g., "ceiling" or "floor", which will be used as the texture file name when you save the model as VRML. The texture file name for a particular polygon is the polygon name with a ".gif" extension.

Under the "Draw" submenu, you can toggle the following options:

Points: draw points or not.

Lines: draw lines or not.

Polygons: draw polygons or not.

Draw 3D: draw in 2D or 3D mode.

When "Draw 3D" is not checked, the image and all points, lines, and polygons are drawn in the image plane. You can edit them, and:

zoom in/out: Ctrl+/-;

move image: drag with right button;

When "Draw 3D" is checked, all the points, lines, and polygons are drawn in 3D (based on your computation of X, Y, Z for each point). The image is texture-mapped onto the polygons (based on your estimates of the homographies H and invH). You cannot edit in this mode, but you can:

scale up/down: Ctrl+/-;

move model parallel to the viewing plane: drag with left button;

move model further/closer: drag with left button upwards/downwards, with Alt down;

rotate around X: drag with left button vertically, with Ctrl down;

rotate around Y: drag with left button horizontally, with Ctrl down;

rotate clockwise/counterclockwise: drag with left button to the right/left, with Shift down;

The skeleton code currently consists of the following C++ files: HelpPageUI.cpp/h, svmUI.cpp/h, ImgView.cpp/h, svmMain.cpp, svm.h, svmAux.h, PriorityQueue.h.

HelpPageUI.cpp/h, svmUI.cpp/h: define a help window and the main window, respectively.

svmAux.h, PriorityQueue.h: define some auxiliary data structures and functions.

svmMain.cpp: defines the "main" function, which is simply an event loop.

svm.h: includes several header files and defines the following important data structures:

    struct SVMPoint {
        double u, v, w;
        double X, Y, Z, W;
    };

    typedef CTypedPtrDblList<SVMPoint> PointList;

Here (u, v, w) are 2D homogeneous coordinates in the image plane, and (X, Y, Z, W) are 3D homogeneous coordinates in the world. If w = 1, (u, v) are image coordinates, ranging from 0 to the image width and from 0 to the image height, respectively. If w = 0, the point is at infinity. Otherwise, (u/w, v/w) gives the image coordinates. The same convention applies to X, Y, Z, W.
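A minimal illustration of this convention (re-declaring SVMPoint here only so the snippet stands alone; the helper name is our own, not the skeleton's):

```cpp
// Dehomogenize an SVMPoint's image coordinates, guarding against
// points at infinity (w == 0), e.g. vanishing points of lines that
// are parallel in the image.
struct SVMPoint { double u, v, w; double X, Y, Z, W; };

static bool imageCoords(const SVMPoint &p, double &u, double &v) {
    if (p.w == 0.0) return false;      // point at infinity: no finite (u, v)
    u = p.u / p.w;
    v = p.v / p.w;
    return true;
}
```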

    struct SVMLine {
        int orientation;
        SVMPoint *pnt1, *pnt2;
    };

    typedef CTypedPtrDblList<SVMLine> LineList;

Here orientation indicates whether the line is supposed to be parallel to the X, Y, or Z axis in 3D, or may have any orientation.

    struct SVMPolygon {
        CTypedPtrDblList<SVMPoint> pntList;
        double cntx, cnty;
        double H[3][3], invH[3][3];
        char name[256];
    };

    typedef CTypedPtrDblList<SVMPolygon> PolygonList;

Each polygon consists of a list of SVMPoints; pointers to the SVMPoints are saved in pntList. (cntx, cnty) is the mean of all points in the list, used for polygon selection in the UI. H is the homography from the normalized texture image of this polygon to the original image; that is, if the INVERSE of H is applied to the image coordinates (u, v, w) in pntList, the result is the texture coordinates, which range over [0, 1]. invH is the inverse matrix of H. H is used when generating texture images from the original image; invH is used to convert image coordinates in pntList to texture coordinates. Whenever you change H, please update invH using the Matrix3by3Inv function in svmAux.h.

name is the name of the polygon. name.gif will be used as the texture file name in the VRML file, as explained below.
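The skeleton's svmAux.h supplies Matrix3by3Inv for keeping invH in sync with H; as a reference, a 3x3 inversion routine of this kind can be written with the adjugate and determinant (the signature below is our own, not necessarily the skeleton's):

```cpp
#include <cmath>

// Invert a 3x3 matrix via the adjugate / determinant.  Returns false if
// the matrix is (numerically) singular, in which case invH is untouched.
static bool invert3x3(const double H[3][3], double invH[3][3]) {
    double c[3][3];                    // cofactor matrix, via cyclic indices
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            int i1 = (i+1)%3, i2 = (i+2)%3, j1 = (j+1)%3, j2 = (j+2)%3;
            c[i][j] = H[i1][j1]*H[i2][j2] - H[i1][j2]*H[i2][j1];
        }
    double det = H[0][0]*c[0][0] + H[0][1]*c[0][1] + H[0][2]*c[0][2];
    if (fabs(det) < 1e-12) return false;
    for (int i = 0; i < 3; ++i)        // inverse = adj(H)/det = c^T / det
        for (int j = 0; j < 3; ++j) invH[i][j] = c[j][i] / det;
    return true;
}
```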

ImgView.cpp/h: defines and implements the ImgView class, which handles most of the UI messages and drawing routines. You will work with the following member data:

PointList pntList;

LineList lineList;

PolygonList plyList;

which save all the points, lines, and polygons you create. 

This project is not like the previous ones, where you are given several TODOs, each of which you fill in with a few lines of code. Instead, you are given a goal: generate 3D textured models. To achieve it, we recommend that you follow the 5 steps mentioned above. It is up to you how to organize your code on top of our skeleton code. The 5 steps can be divided into two stages:

A. Estimate the geometry of the model. 

The goal of this stage is to compute the X, Y, Z, W fields of each SVMPoint. It involves calculating vanishing points and choosing reference points, as covered in Steve's lectures.

B. Compute the texture image for each polygon. 

Based on the 3D point positions, you need to compute the homography from the polygon plane to the image plane, and then resample the original image to generate the texture image. You need to fill in code in the skeleton to do this! As for the naming convention for texture images: if the polygon is named "wall", the texture image should be named "wall.tga". "wall.tga" may contain more than just the wall, so use your scissoring program to cut the wall out of its background. Based on the mask from your scissoring tool and wall.tga, generate a wall.gif with Photoshop in which the background is transparent and the foreground is opaque. Do this for all the polygons, then save the model as VRML. The skeleton code will generate a VRML file that uses each polygon's name with a ".gif" extension as its texture image filename. That's why you want to follow the naming convention: wall --> wall.tga --> wall.gif! If you put the VRML file and all *.gif texture images in the same directory, you can view the model with a VRML viewer. Here is a detailed document about computing homographies, with some helper code to solve linear equations Ax = b where A is symmetric.

The "Tool" submenu is empty. You can put whatever tools you invent to achieve the single view modeling goal there.

Bells and Whistles

Here is a list of suggestions for extending the program for extra credit. You are encouraged to come up with your own extensions. We're always interested in seeing new, unanticipated ways to use this program!

[whistle]Show the camera position in each VRML file, marked by a sphere or other shape.  We discussed how to obtain the height of the camera in lecture.  The X position can be obtained the exact same way, using the vanishing line between the Y and Z vanishing points and a reference length parallel to the X axis (and similarly for the Y position).

[bell] Merging models from multiple images. For instance, create a complete model of a building exterior from a few photographs that capture all four sides.

[bell][bell] Extend the method to create a 3D model from a cylindrical panorama.  Hint:  parallel lines in a panorama sweep out a curved path--you need to determine what this curve is.


Resources


Last modified February 28, 2002