We implemented Seam Carving, a technique described in the SIGGRAPH 2007 paper by Shai Avidan and Ariel Shamir, to resize images without distorting their primary content. The technique works by removing the least noticeable one-pixel-wide path (a seam) running either left to right or top to bottom across an image, and repeating this until the image reaches the desired smaller size.
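The seam search described above is usually done with dynamic programming. The following is a minimal sketch, not our actual implementation: it assumes the energy is given as a list of rows of numbers, and the function names `find_vertical_seam` and `remove_vertical_seam` are hypothetical.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the connected top-to-bottom
    path with the lowest total energy."""
    h, w = len(energy), len(energy[0])
    # cost[y][x] = cheapest total energy of any connected path
    # from the top row down to pixel (y, x).
    cost = [list(energy[0])]
    for y in range(1, h):
        prev = cost[-1]
        cost.append([energy[y][x] + min(prev[max(x - 1, 0):min(x + 2, w)])
                     for x in range(w)])
    # Start at the cheapest cell of the bottom row and walk back up,
    # always moving to the cheapest of the (up to) three neighbours above.
    x = min(range(w), key=lambda i: cost[-1][i])
    seam = [x]
    for y in range(h - 2, -1, -1):
        lo, hi = max(x - 1, 0), min(x + 2, w)
        x = min(range(lo, hi), key=lambda i: cost[y][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(image, seam):
    """Drop one pixel per row, making the image one column narrower."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]
```

Horizontal seams can be handled with the same two functions by transposing the image first and transposing the result back.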
Ideally, resizing would eliminate the sky and the sand and leave the pyramids unaffected. We include visualizations of the optimal seams corresponding to each row and column (the images with many red lines), and visualizations of the lowest-cost horizontal and vertical seams (the images with a single red line). Note that the best seams avoid the pyramids entirely.
We also include a visualization of the calculated energy, or importance, of each pixel. Brighter red represents greater energy, and thus marks pixels that should remain in the downsized image.
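As an illustration of what such an energy map can look like in code, here is a minimal sketch of a gradient-magnitude energy (the sum of absolute horizontal and vertical intensity differences). This particular energy function is an assumption made for the example, as is the name `gradient_energy`; the input is assumed to be a grayscale image stored as a list of rows.

```python
def gradient_energy(gray):
    """Per-pixel energy as |dI/dx| + |dI/dy|, using central differences
    with neighbours clamped at the image border."""
    h, w = len(gray), len(gray[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            dy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            energy[y][x] = abs(dx) + abs(dy)
    return energy
```

Flat regions such as a clear sky produce values near zero, while pixels next to a strong edge produce large values.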
We can remove horizontal seams until the sky is no longer visible above the tips of the pyramids, as desired. However, continued horizontal seam removal "squishes" the tips of the pyramids instead of removing seams through the sand, as we might like. This is because the sand is somewhat noisy. Although the individual energies along the edges of the pyramids are greater than any individual energy among the sand pixels, a seam is chosen by its total cost, and a lower total is obtained by crossing the low-energy sky for most of the seam and the high-energy pyramids for only a fraction of it.
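A toy calculation makes the trade-off concrete. The numbers below are invented purely for illustration and are not measured from the pyramid image.

```python
width = 100

# A seam that stays in the mildly noisy sand: small energy at every pixel.
sand_seam = [3] * width

# A seam that runs through flat sky for most of its length and crosses
# a strong pyramid edge for a short stretch.
sky_seam = [0] * 90 + [20] * 10

print(max(sand_seam), max(sky_seam))  # 3 20   (worst single pixel)
print(sum(sand_seam), sum(sky_seam))  # 300 200 (total seam cost)
```

Even though the second seam contains far more energetic pixels, its total is lower, so it is the one that gets removed.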
We see a similar effect when removing vertical seams to shrink the image horizontally. Initial seams are removed without noticeable effect, but as we continue to remove vertical seams, the smaller pyramids on the sides of the image experience distortion. Note that due to their height and significant intensity variation, the larger, central pyramids remain nearly unaffected.
Here, we show another example of the mixed performance obtained through seam carving. The removal of initial horizontal seams removes the background, leaving the frog intact and undistorted. However, continuing to remove horizontal seams distorts the frog significantly. Note that the copyright text embedded in the image has enough energy to repel the seams almost completely. It is worth asking whether there is even a reasonable way to represent this image at such a different aspect ratio, so perhaps this is less a failure of seam carving than a case of unreasonable expectations.
If one wants to reduce both the height and width of an image using seam carving, there is no clearly best method. In this example, we wish to resize our 500x375 pixel image to a 400x300 image. This means we need to remove 100 vertical seams and 75 horizontal seams. However, a different image is obtained depending on the order in which the seams are removed. One option is to remove all 75 horizontal seams followed by all 100 vertical seams. Another is to alternate horizontal and vertical removals until the 75 horizontal seams are exhausted, then remove the remaining 25 vertical seams. A variant of this alternation technique (not shown) would be to alternate horizontal and vertical removals in the correct ratio to reach the new aspect ratio without leftovers.
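The three orderings can be written down as removal schedules, where "H" means "remove one horizontal seam" and "V" means "remove one vertical seam". This is only a sketch of the schedules themselves; the function names are hypothetical, and the proportional interleaving shown is one possible way to realize the ratio-based variant.

```python
def sequential_order(n_h, n_v):
    """All horizontal seams first, then all vertical seams."""
    return ["H"] * n_h + ["V"] * n_v


def alternating_order(n_h, n_v):
    """Strictly alternate, then finish whichever direction is left over."""
    paired = min(n_h, n_v)
    return ["H", "V"] * paired + ["H"] * (n_h - paired) + ["V"] * (n_v - paired)


def proportional_order(n_h, n_v):
    """Interleave so both directions progress at the same relative rate."""
    order, done_h, done_v = [], 0, 0
    while done_h < n_h or done_v < n_v:
        # Remove horizontally when that direction is not ahead of its share:
        # done_h / n_h <= done_v / n_v, written without division.
        if done_h < n_h and done_h * n_v <= done_v * n_h:
            order.append("H")
            done_h += 1
        else:
            order.append("V")
            done_v += 1
    return order


# 500x375 -> 400x300 needs 75 horizontal and 100 vertical seam removals.
schedule = alternating_order(75, 100)
```

In the alternating schedule for this example, the final 25 entries are all "V", which matches the leftover vertical seams described above.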
We implemented Optimal Resizing as a feature. This computes the order of horizontal and vertical seam removals which minimizes the sum of the cost of all seams removed. This is accomplished through another layer of dynamic programming (in addition to the dynamic programming used to compute the optimal seams themselves). This is quite memory intensive, because many copies of intermediate images must be stored in memory, especially if we are performing a significant change in both dimensions. However, as the example shows, the result is the most visually pleasing and least distorted of our three alternatives, if only slightly so.
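The outer layer of dynamic programming can be sketched as follows. This is an illustrative sketch rather than our implementation: `optimal_order`, `remove_horizontal`, and `remove_vertical` are hypothetical names, and the two callables are assumed to take an image and return the smaller image together with the cost of the seam they removed. To limit memory, the sketch keeps intermediate images only for the previous and current rows of the table, which is a choice made for the example.

```python
def optimal_order(image, n_h, n_v, remove_horizontal, remove_vertical):
    """Return (total_cost, final_image, order) for removing n_h horizontal
    and n_v vertical seams with the lowest summed seam cost.

    Each table cell (r, c) holds the best way found to remove r horizontal
    and c vertical seams; it is reached either from (r - 1, c) by one more
    horizontal removal or from (r, c - 1) by one more vertical removal.
    """
    prev_row = None
    for r in range(n_h + 1):
        row = []
        for c in range(n_v + 1):
            if r == 0 and c == 0:
                row.append((0.0, image, ""))
                continue
            best = None
            if r > 0:
                cost, img, order = prev_row[c]
                new_img, seam_cost = remove_horizontal(img)
                best = (cost + seam_cost, new_img, order + "H")
            if c > 0:
                cost, img, order = row[c - 1]
                new_img, seam_cost = remove_vertical(img)
                if best is None or cost + seam_cost < best[0]:
                    best = (cost + seam_cost, new_img, order + "V")
            row.append(best)
        prev_row = row
    return prev_row[n_v]
```

Every cell requires a seam removal on a stored intermediate image, so the table has (n_h + 1) x (n_v + 1) entries and the cost in both time and memory grows quickly when both dimensions change substantially.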
Order of images (clockwise from upper-left): original, optimal, alternating vertical and horizontal, all horizontal then all vertical.
Though removing seams one at a time in either direction is relatively quick, it is not fast enough for real-time use. To facilitate real-time image resizing in a single dimension, an index map can be pre-computed for a given image. This index map is computed by repeatedly removing seams from an image and recording, for each pixel, the index of the seam that removed it (e.g., the pixels from the first seam would be 0, the second seam 1, and so on). The index map is represented visually by assigning brighter red colors to pixels that are removed by earlier seams. Notice that the brighter red corresponds to the low-energy background, and the dark region is a rough silhouette of the animal's head (and comical ears). This behavior is observed in our seam-carved result, where vertical seams are removed automatically from both sides of the head, leaving the high-energy foreground unaffected.
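The index map idea can be sketched for vertical seams as follows. All names here are hypothetical, and the sketch makes one simplifying assumption that should be stated loudly: the energy is not recomputed after each removal; each pixel simply carries its original energy value along as the image shrinks.

```python
def min_vertical_seam(energy):
    """Lowest-total-energy connected top-to-bottom path (one column per row)."""
    h, w = len(energy), len(energy[0])
    cost = [list(energy[0])]
    for y in range(1, h):
        prev = cost[-1]
        cost.append([energy[y][x] + min(prev[max(x - 1, 0):min(x + 2, w)])
                     for x in range(w)])
    x = min(range(w), key=lambda i: cost[-1][i])
    seam = [x]
    for y in range(h - 2, -1, -1):
        x = min(range(max(x - 1, 0), min(x + 2, w)), key=lambda i: cost[y][i])
        seam.append(x)
    seam.reverse()
    return seam


def build_index_map(energy):
    """index_map[y][x] = index of the seam that removes original pixel (y, x)."""
    h, w = len(energy), len(energy[0])
    index_map = [[0] * w for _ in range(h)]
    work = [list(row) for row in energy]         # shrinking energy image
    cols = [list(range(w)) for _ in range(h)]    # original column of each survivor
    for k in range(w):
        seam = min_vertical_seam(work)
        for y, x in enumerate(seam):
            index_map[y][cols[y].pop(x)] = k
            work[y].pop(x)
    return index_map


def resize_with_index_map(image, index_map, n_remove):
    """Keep only pixels whose seam index is at least n_remove."""
    return [[p for p, k in zip(row, idx) if k >= n_remove]
            for row, idx in zip(image, index_map)]
```

Once the map is built, resizing to any narrower width is a single filtering pass over the pixels, with no seam search at display time.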
Seam carving is very effective at removing blurry or constant parts of the image. However, if the whole image has high energy, the removal of even a small number of seams can cause noticeable distortion.
Straight lines in the image can become warped as seams will repeatedly remove pixels from the lowest energy part of the line. Perhaps an edge detector that accounts for the length and straightness of an edge could avoid this. This could be implemented as a more complex energy function. Also, using randomization of seam selection (as opposed to the optimal seam selected in our implementation), the same location along an edge would not be repeatedly selected, and removal could be naturally spread out along the edge for less noticeable artifacts. Some measure of interactivity (such as the weighting brush described in the paper) could also help here.
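One simple way the randomization idea might be tried is to choose the seam's end point at random among the few cheapest candidates instead of always taking the single cheapest. This is a speculative sketch of that one interpretation, not something we implemented; `pick_seam_end`, the parameter `k`, and the input `bottom_costs` (the last row of the cumulative cost table) are all hypothetical.

```python
import random


def pick_seam_end(bottom_costs, k=3, rng=random):
    """Pick the seam's end column uniformly at random among the k columns
    with the lowest cumulative cost, rather than always the minimum."""
    ranked = sorted(range(len(bottom_costs)), key=lambda i: bottom_costs[i])
    return rng.choice(ranked[:k])


# Example: columns 1 and 3 are the two cheapest, so one of them is chosen.
end = pick_seam_end([5, 1, 9, 2, 7], k=2, rng=random.Random(0))
```

With `k=1` this reduces to the usual optimal choice; larger values trade a slightly higher seam cost for removals that are spread over more locations.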
In general, seam carving will "hammer" specific low-energy areas of the image. This is unnoticeable in relatively constant regions but quite apparent along edges. An energy function that took into account the locations of previously removed seams could offset this tendency, and would be interesting to explore.
Seam carving video in a temporally coherent fashion would be an interesting challenge. Perhaps there is something analogous to our index map that could be used to play back videos at various aspect ratios.