Link

Seam Carving

Implement a content-aware image resizing algorithm.1

Learning goals

  • Solve a real-world graph problem via reduction.

    Reduction is an important tool in computer science theory, but it also has practical applications for solving real-world problems. One of the building blocks of computer science is the idea of abstraction, and reductions represent one of the strongest forms of abstractions by using the solution to one problem as a “black box” for solving a different but somehow related problem.

  • Represent problem solutions as states in a graph.

    In class, we presented several graph algorithms for solving a variety of classical graph problems including shortest paths and minimum spanning trees. But in the real-world, problems rarely immediately jump out as solvable with graph algorithms. How we represent the problem as a graph is one of the most important decisions as it determines which algorithms we can employ to solve the problem.

  • Apply graph augmentation as a strategy for solving a new class of graph problems.

    After coming up with a graph representation of the problem (an abstraction), we can then formally define the problem in terms of the abstraction. However, our problem still might be hard to solve! In seam carving, we need to find the weighted shortest path from any vertex in a distinguished set S to another set T. One common strategy for solving these problems is by augmenting the graph with additional nodes or edges.

Table of contents

  1. Overview
  2. Getting the Assignment
  3. Seam Carving Algorithm
    1. Dual-Gradient Energy Function
    2. Finding a Vertical Seam
    3. Finding a Horizontal Seam
    4. Runtime Requirements
    5. Tips
    6. Testing
  4. Submission

Due Date

Please be forewarned that we will not accept late submissions beyond the 9pm due date because of the grade submission deadline.

Overview

Seam-carving is a content-aware image resizing technique where the image is reduced in size by one pixel of height (or width) at a time. A vertical seam in an image is a path of pixels connected from the top to the bottom with one pixel in each row; a horizontal seam is a path of pixels connected from the left to the right with one pixel in each column. Below left is the original 505-by-287 pixel image; below right is the result after removing 150 vertical seams, resulting in a 30% narrower image. Unlike standard content-agnostic resizing techniques (such as cropping and scaling), seam carving preserves the most interest features (aspect ratio, set of objects present, etc.) of the image.

Although the underlying algorithm is simple and elegant, it was not discovered until 2007. Now, it is now a core feature in Adobe Photoshop and other computer graphics applications.

Josh Hug Ocean

Josh Hug Ocean 357x285

In this assignment, you will create a data type that resizes a WW-by-HH image using the seam-carving technique. Finding and removing a seam involves three parts and a tiny bit of notation.

Getting the Assignment

Task
Pull the skeleton repository to get the seamcarving assignment.

If IntelliJ doesn’t properly recognize the new files or complains about failing to resolve classes or interfaces from previous assignments, refresh Gradle manually through the Gradle tool window (on the right by default):

Gradle Refresh

Seam Carving Algorithm

Background
Read through the description of the seam carving algorithm.

Notation: In image processing, pixel (x,y)(x, y) refers to the pixel in column xx and row yy, with pixel (0,0)(0, 0) at the upper-left corner and pixel (W1,H1)(W - 1, H - 1) at the lower-right corner. This is consistent with the Picture data type that we use in this assignment.

A 3-by-4 image.

(0,0)(0, 0)(1,0)(1, 0)(2,0)(2, 0)
(0,1)(0, 1)(1,1)(1, 1)(2,1)(2, 1)
(0,2)(0, 2)(1,2)(1, 2)(2,2)(2, 2)
(0,3)(0, 3)(1,3)(1, 3)(2,3)(2, 3)
Warning
This is the opposite of the standard mathematical notation used in linear algebra, where (i,j)(i, j) refers to row ii and column jj and (0,0)(0, 0) is at the lower-left corner.

We also assume that the color of each pixel is represented in RGB space, using three integers between 0 and 255. This is consistent with the Color data type.

  1. Energy calculation. The first step is to calculate the energy of a pixel, which is a measure of its importance—the higher the energy, the less likely that the pixel will be included as part of a seam (as you will see in the next step). In this assignment, you will use the dual-gradient energy function, which is described below. Here is the dual-gradient energy function of the surfing image above:

    Josh Hug Ocean Gradient

    The energy is high (white) for pixels in the image where there is a rapid color gradient (such as the boundary between the sea and sky and the boundary between the surfing Josh Hug on the left and the ocean behind him). The seam-carving technique avoids removing such high-energy pixels.

  2. Seam finding. The next step is to identify a vertical seam of minimum total energy. (Finding a horizontal seam is analogous.) This is similar to the classic shortest path problem in an edge-weighted digraph, but there are three important differences:

    • Each edge weight is based on a vertex (pixel) rather than the edge itself.
    • The goal is to find the shortest path from any of the WW pixels in the top row to any of the WW pixels in the bottom row.
    • The digraph is acyclic, where there is a downward edge from pixel (x,y)(x, y) to pixels (x1,y+1)(x - 1, y + 1), (x,y+1)(x, y + 1), and (x+1,y+1)(x + 1, y + 1), assuming that the coordinates are in the prescribed ranges.

    Seams cannot wrap around the image (e.g., a vertical seam cannot cross from the leftmost column of the image to the rightmost column).

    Josh Hug Ocean Vertical Seam

  3. Seam removal. The final step is remove from the image all of the pixels along the vertical or horizontal seam. This has already been implemented for you in the SeamCarver class.

Dual-Gradient Energy Function

Task
Implement DualGradientEnergyFunction using the following math.

For this assignment, we are going to define the energy of a pixel to be the magnititude of the rate of change of colors at that pixel. In other words, a high energy pixel is one that has a large change from its neighbors while a low energy pixel has very little change between itself and its neighbors. The way we are actually going to calculate this value is by approximating the derivative in the X and Y directions then combine these using the pythagorean theorem to get the overall change.

To start, we need to figure out how to calculate the derivative which in more than one dimension is called the gradient. Usually, when we find derivatives we have a continuous function to look at, but here we only have the descrete pixels making up our picture. This means we will have to use something called numerical differentiation that (luckily) some very smart mathematicians have already figured out for us. For our assignment, we will be using two formulas to calculate the gradient. The first is called the Central Difference and is defined as follows:

fx(x,y)f(x+1,y)f(x1,y) f_x(x, y) \approx f(x + 1, y) - f(x - 1, y)
fy(x,y)f(x,y+1)f(x,y1) f_y(x, y) \approx f(x, y + 1) - f(x, y -1)

Where fxf_x and fyf_y represent the gradient of ff in the xx direction and yy direction respectively. These functions are very useful if we know the values of ff surrounding a point (x,y)(x, y), but if we do not know or cannot calculate f(x1,y)f(x - 1, y) we cannot use these functions. This means our approximation method won’t work on the edges of our picture where we don’t have pixels on all sides. Instead, we use a second approximation called the Forward/Backward Difference which is defined as:

fx(x,y)3f(x,y)+4f(x+δ,y)f(x+2δ,y) f_x(x, y) \approx -3 f(x, y) + 4 f(x + \delta, y) - f(x + 2 \delta, y)
fy(x,y)3f(x,y)+4f(x,y+δ)f(x,y+2δ) f_y(x, y) \approx -3 f(x, y) + 4 f(x, y + \delta) - f(x, y + 2 \delta)

Where δ\delta can be either 1 or -1. When it is 1 we call it the Forward Difference because the points are increasing and when it is -1 we call it the Backward Difference since the points are decreasing. As you can see, this allows us to calculate the approximate value of fxf_x and fyf_y on the edges of the picture since our formula only involves points on one side of x and y. One note for pedantics is that the true value of the gradient is actually scaled and may have a different sign from what we are calculating, but since we will eventually be squaring these numbers and only care about the relative values, this ends up not impacting our calculations.

Now, we need to use these approximations to calculate the overall energy at a pixel. First, we will use R(x,y)R(x,y), G(x,y)G(x,y), and B(x,y)B(x,y) to represent the red, green, and blue values of the pixel located at (x,y)(x,y). Each one of these function also has an xx and yy gradient like our function ff above which we will denote using the same subscripts of xx and yy. We can now define the square of the total rates of change in the xx and yy direction to be:

Δx2(x,y)=Rx(x,y)2+Gx(x,y)2+Bx(x,y)2 \Delta_x^2(x, y) = R_x(x, y)^2 + G_x(x, y)^2 + B_x(x, y)^2
Δy2(x,y)=Ry(x,y)2+Gy(x,y)2+By(x,y)2 \Delta_y^2(x, y) = R_y(x, y)^2 + G_y(x, y)^2 + B_y(x, y)^2

Where RxR_x and the related functions can be approximated using the method above. Then, for the final part, we need to use the Pythagorean theorem to combine these two values to get the magnitude of the change which gives us our final formula for energy:

energy(x,y)=Δx2(x,y)+Δy2(x,y) \operatorname{energy}(x, y) = \sqrt{\Delta_x^2(x, y) + \Delta_y^2(x, y)}

As an example, consider the 3-by-4 image with RGB values (each component is an integer between 0 and 255) as shown in the table below. (This is the 3x4.png image in your seamcarving/data/ folder.)

Dual Gradient Computation

Example 1

The energy of the non-border pixel (1,2)(1, 2) is calculated using the central difference from pixels (0,2)(0, 2) and (2,2)(2, 2) for the x-gradient

Rx(1,2)=255255=0R_x(1, 2) = 255 - 255 = 0
Gx(1,2)=205203=2G_x(1, 2) = 205 - 203 = 2
Bx(1,2)=25551=204B_x(1, 2) = 255 - 51 = 204

yielding Δx2(1,2)=22+2042=41620\Delta_x^2(1, 2) = 2^2 + 204^2 = 41620;

and pixels (1,1)(1, 1) and (1,3)(1, 3) for the y-gradient

Ry(1,2)=255255=0R_y(1, 2) = 255 - 255 = 0
Gy(1,2)=255153=102G_y(1, 2) = 255 - 153 = 102
By(1,2)=153153=0B_y(1, 2) = 153 - 153 = 0

yielding Δy2(1,2)=1022=10404\Delta_y^2(1, 2) = 102^2 = 10404.

Thus, the energy of pixel (1,2)(1, 2) is 41620+10404=52024\sqrt{41620 + 10404} = \sqrt{52024}.

What is the energy of pixel (1, 1)?

The energy of pixel (1,1)(1, 1) is 2042+1032=52225\sqrt{204^2 + 103^2} = \sqrt{52225}.

Example 2

The energy of the border pixel (0,3)(0, 3) is calculated by the forward difference using pixels (0,3)(0, 3), (1,3)(1, 3), and (2,3)(2, 3) for the x-gradient since it is on the left edge of the image

Rx(0,3)=3255+4255255=0R_x(0, 3) = -3 \cdot 255 + 4 \cdot 255 - 255 = 0
Gx(0,3)=3255+4255255=0G_x(0, 3) = -3 \cdot 255 + 4 \cdot 255 - 255 = 0
Bx(0,3)=351+4153255=204B_x(0, 3) = -3 \cdot 51 + 4 \cdot 153 - 255 = 204

yielding Δx2(0,3)=2042=41616\Delta_x^2(0, 3) = 204^2 = 41616.

Similarly, for the y-gradient we will use the backward difference using pixels (0,3)(0, 3), (0,2)(0, 2), and (0,1)(0, 1) since the point is on the bottom edge of the image.

Ry(0,3)=3255+4255255=0R_y(0, 3) = -3 \cdot 255 + 4 \cdot 255 - 255 = 0
Gy(0,3)=3255+4203153=106G_y(0, 3) = -3 \cdot 255 + 4 \cdot 203 - 153 = -106
By(0,3)=351+45151=0B_y(0, 3) = -3 \cdot 51 + 4 \cdot 51 - 51 = 0

yielding Δy2(0,3)=(106)2=11236\Delta_y^2(0, 3) = (-106)^2 = 11236.

Thus, the energy of pixel (0,3)(0, 3) is 41616+11236=52852\sqrt{41616 + 11236} = \sqrt{52852}.

Summary

The energy for each pixel is given in the table below.

√52852√52641√52432
√52020√52225√52432
√52024√52024√52024
√52852√52020√51220

Finding a Vertical Seam

Task
Implement SeamFinder.findVerticalSeam by reducing the problem to one solvable using Dijkstra’s algorithm.

The findVerticalSeam method returns an array of length HH such that entry yy is the column number of the pixel to be removed from row yy of the image. Consider the 6-by-5 image (6x5.png) with RGB values shown in the table below.

(78, 209, 79)(63, 118, 247)(92, 175, 95)(243, 73, 183)(210, 109, 104)(252, 101, 119)
(224, 191, 182)(108, 89, 82)(80, 196, 230)(112, 156, 180)(176, 178, 120)(142, 151, 142)
(117, 189, 149)(171, 231, 153)(149, 164, 168)(107, 119, 71)(120, 105, 138)(163, 174, 196)
(163, 222, 132)(187, 117, 183)(92, 145, 69)(158, 143, 79)(220, 75, 222)(189, 73, 214)
(211, 120, 173)(188, 218, 244)(214, 103, 68)(163, 166, 246)(79, 125, 246)(211, 201, 98)

The minimum energy vertical seam is highlighted in blue. In this case, the method findVerticalSeam returns the array [4, 4, 3, 2, 2] because the pixels in the minimum energy vertical seam are (4,0),(4,1),(3,2),(2,3),(2,4)(4, 0), (4, 1), (3, 2), (2, 3), (2, 4).

Vertical Seam

When there are multiple vertical seams with minimal total energy, your method can return any such seam.

Finding a Horizontal Seam

Task
Implement SeamFinder.findHorizontalSeam, which should look very similar to SeamFinder.findVerticalSeam.

The behavior of findHorizontalSeam is analogous to that of findVerticalSeam except that it should return an array of length WW such that entry xx is the row number of the pixel to be removed from column xx of the image. For the 6-by-5 image, the method findHorizontalSeam returns the array [2, 2, 1, 2, 1, 1] because the pixels in the minimum energy horizontal seam are (0,2),(1,2),(2,1),(3,2),(4,1),(5,1)(0, 2), (1, 2), (2, 1), (3, 2), (4, 1), (5, 1).

Horizontal Seam

Runtime Requirements

Task
Meet the following runtime requirements:
  • The runtime of energy should be in Θ(1)\Theta(1) because it is a closed-form expression.
  • The runtime for finding either the vertical seam or the horizontal seam should both be in O(WHlog(WH))\mathcal{O}(WH \log (WH)).

Tips

Recommended
Read these.

We’re reducing the seam carving problem to a shortest path problem that we can solve using Dijkstra’s algorithm; this has a number of implications:

  1. General reduction notes
    • Before using AStarPathFinder, we need to preprocess the problem into a format AStarPathFinder can use.
    • After using AStarPathFinder, we need to postprocess its output into the format the SeamFinder interface needs.
  2. Preprocessing for AStarPathFinder
    • The path finder needs a graph to run on, so you’ll need to implement an AStarGraph for this reduction.
      • Here’s a rough specification for finding vertical seams:
        • Vertices represent pixels.
        • Edges exist from a pixel to its 3 downward neighbors.
        • Edges have weight representing energy.
        • The shortest (least total weight) path from top to bottom represents the minimum-energy seam.
      • It might also be useful to look at the example graph implementations from the A* assignment. Notice that the graphs for puzzles don’t actually store a complete representation of the graph, instead opting to lazily generate the output for neighbors when the method gets called—this reduces the overall memory usage and runtime for large graphs which might otherwise cause Java to run out of memory.
    • [!] The path finder needs a single vertex to start on, and a single vertex to end on. In the seam carving problem, there are multiple possible first and last pixels to choose, so we’ll need to do something else in the graph to allow the path finder to choose any valid seam. This is an important but unintuitive observation for this assignment!

Testing

Recommended
Read these notes about the testing and grading.

We have not provided many unit tests with this assignment. Instead, we’ve included multiple images of various sizes in the seamcarving/data/ folder, along with the expected energies and seams indicated in text files. Feel free to run the provided demo applications and manually check your results against the expected ones in the text files, or practice writing unit tests by converting the expected results in the text files into equivalent tests.

Additionally, like the previous assignment, this assignment includes a method that creates a ShortestPathFinder, which we will override on the grader to use the solution implementation. We recommend using this so that you don’t get penalized again for any bugs or inefficiencies left over in your previous assignments.

Submission

Commit and push your changes to GitLab before submitting your homework to Gradescope.

🥳 Congratulations on completing all the assignments!

  1. Josh Hug. 2015. Seam Carving. In Nifty Assignments 2015. http://nifty.stanford.edu/2015/hug-seam-carving/

    Josh Hug, Kevin Wayne, and Maia Ginsburg. 2019. Seam Carving. In COS 226, Spring 2019. https://www.cs.princeton.edu/courses/archive/spring19/cos226/assignments/seam/specification.php