A range image is, in principle, a regular (i,j) grid of points with a depth at every pixel and a projection procedure for taking a point in 3D and mapping (projecting) it into grid coordinates. In practice, the projection may only be approximately known, and there may be some irregularity to the arrangement of the points (e.g., every other line may be shifted due to interlacing). For this reason, we represent a range image as a grid of points in 3D (i.e., an (i,j) grid with x, y, and z coordinates at every pixel) that is observed with a view direction looking roughly down the z-axis. The data structure that stores this grid of points is a RangeGrid, and it can be read in from a ply file with RangeGrid::readPly().
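For concreteness, here is a rough sketch of what such a grid holds. The names are illustrative, not the actual RangeGrid members (consult its header for those); the Point3 struct is reused by the later sketches in this writeup:

    struct Point3 { float x, y, z; };

    struct RangeGridSketch {
        int nx, ny;          // grid resolution
        Point3 *points;      // nx*ny points, viewed roughly down the z-axis
        bool *hasSample;     // hypothetical: real scans typically have holes
    };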
Note: in this writeup we use the terms "range" and "depth" interchangeably. However, they are often defined differently: range can be defined as the distance to the scene point along a viewing ray from the optical center, whereas depth is the component of range in the direction of the z-axis. For this assignment, it is more convenient to store depth values instead of true range values, because these are easily computed via matrix projection. Consequently, when we say "range", we mean "depth" (z-values).
Scanalyze allows you to align a set of range images to each other. After the alignment is complete, a set of rigid body transformations are generated, one for each range image. Note that the range image data are left in their original coordinate systems. We have provided code that reads in these transformations: readXform() in xfio.h.
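The exact signature is in xfio.h; purely as a hedged sketch, assuming here that readXform() fills a 4x4 matrix from a .xf file:

    #include "xfio.h"

    void loadScanTransform()
    {
        float xf[4][4];               // rigid-body transform for one scan
        readXform("scan0.xf", xf);    // assumed signature; check xfio.h
        // The scans stay in their original coordinate systems, so apply
        // xf to a scan's points before combining it with other scans.
    }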
If the projection for a range image is only approximately known or if there is some irregularity to the point structure, then it will make sense to resample the data to a regular grid as described in the following steps.
If, however, you know the projection to high accuracy, and your points lie along projection rays that pierce a projection plane to form a rectilinear grid, then you do not need to resample your data. For convenience, you should still build a depth map (see DepthMap.h) that stores the z-values of your points.
In order to ensure a completely regular range image, we will actually read in the range image and resample it onto a rectilinear grid. The first step in this process is creation of a "range surface." A range surface is a straightforward triangulation of the original range image that avoids creating triangles across step edges, if desired. The code is currently configured to skip triangles whose normals have a z-component that is too small (assuming the viewer is looking down the z-axis).
A range surface is really just a triangle mesh, as described in Mesh.h, and the function that creates a mesh from a range grid is meshFromGrid() in meshAux.cc.
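The provided meshFromGrid() already implements the triangle-skipping test; the sketch below just illustrates the criterion, using the Point3 struct from the earlier sketch:

    #include <math.h>

    // Keep a triangle only if its normalized normal's z-component is large
    // enough, i.e., the triangle is not nearly edge-on to a viewer looking
    // down the z-axis (which would indicate a step edge).
    bool keepTriangle(Point3 a, Point3 b, Point3 c, float minCosine)
    {
        float ux = b.x-a.x, uy = b.y-a.y, uz = b.z-a.z;
        float vx = c.x-a.x, vy = c.y-a.y, vz = c.z-a.z;
        float nx = uy*vz - uz*vy;              // normal = (b-a) x (c-a)
        float ny = uz*vx - ux*vz;
        float nz = ux*vy - uy*vx;
        float len = sqrtf(nx*nx + ny*ny + nz*nz);
        if (len == 0.0f) return false;         // degenerate triangle
        return fabsf(nz)/len >= minCosine;     // skip near-step-edge triangles
    }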
Once you have a range surface, you can resample it onto a rectilinear grid using a simple Z-buffer algorithm. You will need to allocate the Z-buffer, which we'll call a depth map (see DepthMap.h), and then define the projection operator (see Projector.h) used to project a 3D point into the depth map. Note that this "projection" is just a transformation that maps (x,y) coordinates onto the depth map grid while preserving z-values to be interpolated later.
Once the depth map and projection are set up, you should transform the range surface with the projection operator and then pass the depth map and range surface to the softRender() routine in softRender.cc. This routine essentially performs a rasterization of the range surface triangles.
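softRender() is the routine to use; purely to show what such a pass does, here is a minimal sketch of a z-buffer triangle rasterizer. It assumes the vertices are already transformed so that (x, y) are depth-map pixel coordinates, the buffer is initialized to a large value, and smaller z means closer to the scanner (flip the comparison if your convention is reversed):

    #include <math.h>

    static float min3(float p, float q, float r)
        { return p < q ? (p < r ? p : r) : (q < r ? q : r); }
    static float max3(float p, float q, float r)
        { return p > q ? (p > r ? p : r) : (q > r ? q : r); }

    void rasterizeTri(Point3 a, Point3 b, Point3 c,
                      float *depth, int width, int height)
    {
        int x0 = (int)floorf(min3(a.x, b.x, c.x)); if (x0 < 0) x0 = 0;
        int x1 = (int)ceilf (max3(a.x, b.x, c.x)); if (x1 > width-1)  x1 = width-1;
        int y0 = (int)floorf(min3(a.y, b.y, c.y)); if (y0 < 0) y0 = 0;
        int y1 = (int)ceilf (max3(a.y, b.y, c.y)); if (y1 > height-1) y1 = height-1;

        float area = (b.x-a.x)*(c.y-a.y) - (b.y-a.y)*(c.x-a.x);
        if (area == 0.0f) return;                // edge-on after projection

        for (int y = y0; y <= y1; y++)
            for (int x = x0; x <= x1; x++) {
                // barycentric coordinates at the integer sample point (x, y)
                float w0 = ((b.x-x)*(c.y-y) - (b.y-y)*(c.x-x)) / area;
                float w1 = ((c.x-x)*(a.y-y) - (c.y-y)*(a.x-x)) / area;
                float w2 = 1.0f - w0 - w1;
                if (w0 < 0 || w1 < 0 || w2 < 0) continue;  // outside triangle
                float z = w0*a.z + w1*b.z + w2*c.z;        // interpolate z
                if (z < depth[y*width + x])                // keep nearest
                    depth[y*width + x] = z;
            }
    }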
Note that, for testing purposes, you can use DepthMap::buildRangeGrid() to convert the depth map into a range grid object which can then be written to a ply file using RangeGrid::writePly(). For this to work, you will also need to define the inverse projection operation Projector::unproject().
You should free up memory wherever you can. The RangeGrid and (especially) the Mesh data structures can be costly in storage space, and if you load in a bunch of scans, you'll find yourself swapping to disk sooner than you'd like.
The surface reconstruction that you will compute corresponds to a weighted blend of range images. In fact, the weighting will be applied on a "per-range-image-point" basis. As such, each range image point needs a weight, which you can compute directly from the depth map you derive from resampling the range surface. There are three kinds of weights to consider: sampling rate, boundary proximity, and scanner-specific confidence.
Sampling rate: Given a piece of surface seen from two points of view, it is desirable to give greater weight to the point of view that is more directly facing the surface, i.e., the point of view with the greater sampling rate over the surface. In general, the surface will be curved, so which point of view has a greater sampling rate depends on where you are on the surface. A simple metric is to compute the normal to the surface at each range point, take the dot product with the viewing direction (setting negative values to zero), and use this as the weight. From the depth map, you can devise a metric based on the gradient at each depth sample.
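Here is one way to implement that metric from the depth map. This is a sketch, assuming unit pixel spacing (scale the differences by the true spacing otherwise), a viewer looking down the z-axis, and (x, y) kept at least one pixel from the map border and away from edge-tagged samples:

    #include <math.h>

    // The surface z(x,y) has normal proportional to (-dz/dx, -dz/dy, 1),
    // so its normalized dot product with the view direction reduces to
    // 1/sqrt(gx*gx + gy*gy + 1).
    float samplingWeight(const float *depth, int width, int x, int y)
    {
        float gx = 0.5f * (depth[y*width + x+1] - depth[y*width + x-1]);
        float gy = 0.5f * (depth[(y+1)*width + x] - depth[(y-1)*width + x]);
        return 1.0f / sqrtf(gx*gx + gy*gy + 1.0f);  // 1 head-on, ->0 grazing
    }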
Boundary proximity: boundary proximity measures how near a range point is to the edge of the scan. This weight is needed to ensure smooth blends near the boundary of range images and can be approximated in the depth map by computing how many pixels a given depth point is from an edge of the scan. One way of defining the edge pixels is to look in each 2x2 neighborhood and see if there is a change in depth that is above some threshold. If so, the pixels in the neighborhood can be tagged as edge pixels. We then recommend computing the distance of points to the scan edge using a distance transform. While there are much faster distance transform algorithms, a simple method is to mark all edge pixels as distance 0 and initialize the remaining pixels to distance D (the cutoff). Then, for each pixel that has distance k, mark its neighbors as having distance k+1 (unless they already have a distance assigned that is less than k+1), and recurse D-1 times. You can save time by only applying the edge weight to pixels that are close to an edge. The boundary weight is then minBoundaryWeight + distance/D. The minBoundaryWeight is a slight bias to prevent you from completely ignoring depth samples near the boundary.
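A breadth-first implementation of that transform might look like the following sketch, where isEdge holds the result of the 2x2 depth test and distances are capped at D so the loop never wanders far from an edge:

    #include <queue>

    void boundaryDistance(const bool *isEdge, int *dist,
                          int width, int height, int D)
    {
        std::queue<int> q;
        for (int i = 0; i < width*height; i++) {
            dist[i] = isEdge[i] ? 0 : D;      // edge pixels seed the search
            if (isEdge[i]) q.push(i);
        }
        while (!q.empty()) {
            int i = q.front(); q.pop();
            if (dist[i] + 1 >= D) continue;   // only pixels near an edge matter
            int x = i % width, y = i / width;
            int  nbr[4] = { i-1,   i+1,         i-width, i+width };
            bool ok[4]  = { x > 0, x < width-1, y > 0,   y < height-1 };
            for (int k = 0; k < 4; k++)
                if (ok[k] && dist[nbr[k]] > dist[i] + 1) {
                    dist[nbr[k]] = dist[i] + 1;  // city-block distance
                    q.push(nbr[k]);
                }
        }
    }

The per-pixel boundary weight is then minBoundaryWeight + (float)dist[i]/D, as described above.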
Scanner-specific confidence: If you have any prior knowledge about the scanning system's limitations, then you can incorporate it into the weight computations.
The resulting weight per depth sample will then be the product of all these weight terms.
The method of isosurface extraction that you will implement is known as a continuation method; i.e., starting from a point on the isosurface, you can grow the rest of the surface by visiting voxels that are near the current surface. The idea is that you maintain a queue of cubes to visit. So, a typical sequence of events would be to remove a cube from the queue, extract triangles, and then add neighboring cubes that share edges with signed distance crossings to the queue.
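In skeletal form the loop might look like the sketch below, where polygonize() and neighborsWithCrossings() are hypothetical placeholders for your own cube-processing code:

    #include <queue>
    #include <set>
    #include <vector>

    struct CubeID {
        int i, j, k;
        bool operator<(const CubeID &o) const {
            if (i != o.i) return i < o.i;
            if (j != o.j) return j < o.j;
            return k < o.k;
        }
    };

    void polygonize(const CubeID &c);                            // yours
    std::vector<CubeID> neighborsWithCrossings(const CubeID &c); // yours

    void growSurface(CubeID seed)
    {
        std::queue<CubeID> toVisit;
        std::set<CubeID> seen;        // stand-in for the hash table below
        toVisit.push(seed);
        seen.insert(seed);
        while (!toVisit.empty()) {
            CubeID c = toVisit.front(); toVisit.pop();
            polygonize(c);            // evaluate corners, emit triangles
            std::vector<CubeID> nbrs = neighborsWithCrossings(c);
            for (size_t n = 0; n < nbrs.size(); n++)
                if (seen.insert(nbrs[n]).second)   // not queued before
                    toVisit.push(nbrs[n]);
        }
    }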
To avoid recomputing values and zero crossings around voxels contained in previously processed cubes, you will also need to maintain a hash table. In addition to keeping track of which cubes have been processed, the hash table should store cached values of useful information that has already been computed for each cube and that might be shared with neighbors: notably the voxel values, the indices of vertices computed at zero crossings, and possibly gradients.
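A cache entry per visited cube might hold something like the following (the fields are suggestions, not requirements):

    struct CubeCache {
        float value[8];      // signed-distance values at the 8 corners
        int   edgeVert[12];  // mesh index of the zero-crossing vertex on
                             // each edge, or -1 if none computed yet
        // optionally: float gradient[8][3];  // for normals (see below)
    };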
To bootstrap the continuation method of isosurface extraction, you need to find a seed point on each connected component of the isosurface. For this project, you may assume that there is only one connected component. To find the seed point, we recommend choosing a depth map sample with high confidence, determining its coordinates in world space, and searching the voxels in the neighborhood of that point until a sign change is detected. Brute-force search is one approach; following the gradient might be (marginally) better.
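A brute-force version of that search, reusing CubeID from the sketch above and with cubeHasSignChange() standing in for your own corner-sign test, might be:

    bool cubeHasSignChange(int i, int j, int k);   // yours

    // Scan a (2r+1)^3 box of cubes around (ci,cj,ck), the voxel nearest
    // the chosen high-confidence sample, for a signed-distance crossing.
    bool findSeed(int ci, int cj, int ck, int r, CubeID &seed)
    {
        for (int i = ci-r; i <= ci+r; i++)
            for (int j = cj-r; j <= cj+r; j++)
                for (int k = ck-r; k <= ck+r; k++)
                    if (cubeHasSignChange(i, j, k)) {
                        seed.i = i; seed.j = j; seed.k = k;
                        return true;
                    }
        return false;
    }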
After projecting a voxel into a depth map, you need to interpolate the depths and weights stored in the depth map. The interpolated depth can be used to compute the signed distance. You should avoid interpolating among depths that are marked as edges.
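A sketch of that lookup follows, assuming (px, py) already lies inside the map with its 2x2 sample footprint in bounds; it rejects any footprint that touches an edge-tagged pixel:

    bool interpDepth(const float *depth, const bool *isEdge, int width,
                     float px, float py, float *out)
    {
        int x = (int)px, y = (int)py;
        float fx = px - x, fy = py - y;
        int i00 = y*width + x;
        int idx[4] = { i00, i00 + 1, i00 + width, i00 + width + 1 };
        for (int k = 0; k < 4; k++)
            if (isEdge[idx[k]]) return false;   // never blend across an edge
        *out = (1-fx)*(1-fy)*depth[idx[0]] + fx*(1-fy)*depth[idx[1]]
             + (1-fx)*   fy *depth[idx[2]] + fx*   fy *depth[idx[3]];
        return true;             // interpolate the weights the same way
    }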
For any given voxel, you can compute a signed distance with respect to every range image. However, some of these signed distances can be quite large in magnitude even if a voxel is on the surface, due to occlusions or noise. A simple way to avoid combining such estimates is to omit signed distances whose magnitudes exceed a certain threshold. In principle, this threshold should be set to a value commensurate with the amount of noise in the range data.
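Putting the pieces together for one voxel, the weighted blend is sum(w*d)/sum(w) over the surviving contributions. In the sketch below, signedDistance() is a hypothetical per-scan evaluator (project the voxel, interpolate depth and weight, subtract), and UNSEEN is an arbitrary sentinel for voxels no scan can vouch for:

    #include <math.h>

    bool signedDistance(int scan, Point3 v, float *d, float *w);  // yours
    const float UNSEEN = 1e30f;                // arbitrary sentinel value

    float voxelValue(Point3 v, int numScans, float maxDist)
    {
        float num = 0.0f, den = 0.0f;
        for (int s = 0; s < numScans; s++) {
            float d, w;
            if (!signedDistance(s, v, &d, &w)) continue;  // no valid projection
            if (fabsf(d) > maxDist) continue;             // occlusion/noise guard
            num += w * d;                                 // weighted blend:
            den += w;                                     //   sum(w*d)/sum(w)
        }
        return (den > 0.0f) ? num/den : UNSEEN;
    }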
One case that will not be handled by this heuristic is thin surfaces. In this case, a voxel may receive signed distance contributions from opposite sides of the surface. Extra credit goes to anyone who comes up with a nice way of handling this case!
Bill Lorensen, one of the authors of the Marching Cubes SIGGRAPH paper, has provided us with his edge table, which prevents inconsistent geometry from being generated. You will find a complete description of the edge table data structure and how it relates to a cube of voxels in mc.h.
For each edge with oppositely signed voxel values on either end, you will need to determine the sub-voxel location of the zero crossing by linear interpolation. In addition, you should linearly interpolate the weight for later visualization (the resulting mesh can be viewed with per-vertex colors derived from the weights).
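The interpolation itself is only a few lines. A minimal sketch, where v0 and v1 are the (opposite-signed) values at corners p0 and p1:

    // The edge table guarantees v0 and v1 have opposite signs, so
    // 0 <= t <= 1 and the vertex lies on the edge.
    Point3 edgeVertex(Point3 p0, Point3 p1, float v0, float v1)
    {
        float t = v0 / (v0 - v1);       // where the value crosses zero
        Point3 p;
        p.x = p0.x + t * (p1.x - p0.x);
        p.y = p0.y + t * (p1.y - p0.y);
        p.z = p0.z + t * (p1.z - p0.z);
        return p;                       // interpolate the vertex weight the
                                        // same way: w = w0 + t*(w1 - w0)
    }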
Optionally, you can also linearly interpolate the gradient value and use the resulting gradient, normalized, as the surface normal at that point. In order to compute gradients at voxel corners, however, you will in general have to visit neighboring voxels not contained within the current cube.
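With value(i,j,k) standing in for a (possibly cached) voxel lookup, a central-difference gradient at a corner is, assuming unit voxel spacing:

    float value(int i, int j, int k);   // yours; may pull in cached values

    void gradientAt(int i, int j, int k, float g[3])
    {
        g[0] = 0.5f * (value(i+1, j, k) - value(i-1, j, k));
        g[1] = 0.5f * (value(i, j+1, k) - value(i, j-1, k));
        g[2] = 0.5f * (value(i, j, k+1) - value(i, j, k-1));
        // normalize (and divide by the voxel spacing if not 1) before
        // using the result as a surface normal
    }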
As you generate vertices and triangles, you will need to dynamically allocate space. In addition, whenever a triangle uses a vertex that has previously been generated by another cube, you should not add that vertex to the mesh again. Rather, you should just reuse its index. This last step is important for generating a concise mesh that is itself a single connected component.
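One way to make the reuse cheap is to key each vertex by the grid edge it sits on, e.g., the edge's lower corner (i,j,k) plus an axis. The map and bit-packing below are illustrative (the packing assumes non-negative grid indices below 2^21):

    #include <map>
    #include <utility>
    #include <vector>

    typedef std::pair<long long, int> EdgeKey;   // (packed corner, axis 0-2)

    long long packCorner(int i, int j, int k)
    {
        return ((long long)i << 42) | ((long long)j << 21) | (long long)k;
    }

    // Return the mesh index for the vertex on this edge, creating it only
    // if no neighboring cube has generated it already.
    int vertexForEdge(std::map<EdgeKey,int> &table, std::vector<Point3> &verts,
                      int i, int j, int k, int axis, Point3 p)
    {
        EdgeKey key(packCorner(i, j, k), axis);
        std::map<EdgeKey,int>::iterator it = table.find(key);
        if (it != table.end()) return it->second;   // reuse existing index
        verts.push_back(p);
        table[key] = (int)verts.size() - 1;
        return (int)verts.size() - 1;
    }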