Link

Introduction and Energy Calculation

Determining the importance of pixels in an image.

Table of contents

  1. Seam Carving Algorithm
    1. Notation
    2. Overview
  2. Dual-Gradient Energy Function
    1. Runtime Requirements

Seam Carving Algorithm

Background
Read through the description of the seam carving algorithm.

Seam-carving is a content-aware image resizing technique where the image is reduced in size by one pixel of height (or width) at a time. A vertical seam in an image is a path of adjacent or diagonally-adjacent pixels connected from the top of the image to the bottom, with one pixel in each row; a horizontal seam is the same from left to right. (Seams cannot wrap around the image: e.g., a vertical seam cannot cross from the leftmost column of the image to the rightmost column.) Below left is the original 505-by-287 pixel image; below right is the result after removing 150 vertical seams, resulting in a 30% narrower image. Unlike standard content-agnostic resizing techniques (such as cropping and scaling), seam carving preserves the most interest features (aspect ratio, set of objects present, etc.) of the image.

Although the underlying algorithm is simple and elegant, it was not discovered until 2007. Now, it is now a core feature in Adobe Photoshop and other computer graphics applications.

Josh Hug Ocean

Josh Hug Ocean 357x285

In this assignment, you will create implement an algorithm that resizes a WW-by-HH image using the seam-carving technique.

Notation

In image processing, pixel (x,y)(x, y) refers to the pixel in column xx and row yy, with pixel (0,0)(0, 0) at the upper-left corner and pixel (W1,H1)(W - 1, H - 1) at the lower-right corner. This is consistent with the Picture data type that we use in this assignment.

A 3-by-4 image.

(0,0)(0, 0)(1,0)(1, 0)(2,0)(2, 0)
(0,1)(0, 1)(1,1)(1, 1)(2,1)(2, 1)
(0,2)(0, 2)(1,2)(1, 2)(2,2)(2, 2)
(0,3)(0, 3)(1,3)(1, 3)(2,3)(2, 3)
Warning
This is the opposite of the standard mathematical notation used in linear algebra, where (i,j)(i, j) refers to row ii and column jj and (0,0)(0, 0) is at the lower-left corner.

We also assume that the color of each pixel is represented in RGB space, using three integers between 0 and 255. This is consistent with the Color data type.

Overview

Finding and removing a seam involves three parts:

  1. Pre-processing: The first step is to pre-process the image to determine the importance of each pixel. For each pixel, we calculate an energy, a measure of its distinctness from its adjacent pixels; the higher the energy, the more important it is. In this assignment, you will use the dual-gradient energy function, which is described below. Here are the values of the dual-gradient energy function on the surfing image above:

    Josh Hug Ocean Gradient

    The energy is high (white) for pixels in the image where there is a rapid color gradient (such as the boundary between the sea and sky and the boundary between the surfing Josh Hug on the left and the ocean behind him). The seam-carving technique will avoid removing such high-energy pixels.

  2. Seam finding: The next step is to identify a seam of minimum total energy. The direction of the seam depends on the dimension being resized: resizing horizontally requires vertical seams, and resizing vertically requires horizontal seams. Seam finding is similar to the classic shortest path problem in a weighted directed graph, but with some key differences that will be discussed later.

    Josh Hug Ocean Vertical Seam

  3. Seam removal: The final step is remove from the image all of the pixels along the vertical or horizontal seam. This has already been implemented for you in the SeamCarver class.

Dual-Gradient Energy Function

Task
Implement DualGradientEnergyFunction using the following math.

For this assignment, we are going to define the energy of a pixel to be the magnitude of the rate of change of colors at that pixel. In other words, a high energy pixel is one that has a large change from its neighbors while a low energy pixel has very little change between itself and its neighbors. The way we are actually going to calculate this value is by approximating the derivative in the X and Y directions then combine these using the pythagorean theorem to get the overall change.

To start, we need to figure out how to calculate the derivative which in more than one dimension is called the gradient. Usually, when we find derivatives we have a continuous function to look at, but here we only have the descrete pixels making up our picture. This means we will have to use something called numerical differentiation that (luckily) some very smart mathematicians have already figured out for us. For our assignment, we will be using two formulas to calculate the gradient. The first is called the Central Difference and is defined as follows:

fx(x,y)f(x+1,y)f(x1,y) f_x(x, y) \approx f(x + 1, y) - f(x - 1, y)
fy(x,y)f(x,y+1)f(x,y1) f_y(x, y) \approx f(x, y + 1) - f(x, y -1)

Where fxf_x and fyf_y represent the gradient of ff in the xx direction and yy direction respectively. These functions are very useful if we know the values of ff surrounding a point (x,y)(x, y), but if we do not know or cannot calculate f(x1,y)f(x - 1, y) we cannot use these functions. This means our approximation method won’t work on the edges of our picture where we don’t have pixels on all sides. Instead, we use a second approximation called the Forward/Backward Difference which is defined as:

fx(x,y)3f(x,y)+4f(x+δ,y)f(x+2δ,y) f_x(x, y) \approx -3 f(x, y) + 4 f(x + \delta, y) - f(x + 2 \delta, y)
fy(x,y)3f(x,y)+4f(x,y+δ)f(x,y+2δ) f_y(x, y) \approx -3 f(x, y) + 4 f(x, y + \delta) - f(x, y + 2 \delta)

Where δ\delta can be either 1 or -1. When it is 1 we call it the Forward Difference because the points are increasing and when it is -1 we call it the Backward Difference since the points are decreasing. As you can see, this allows us to calculate the approximate value of fxf_x and fyf_y on the edges of the picture since our formula only involves points on one side of x and y. One note for pedantics is that the true value of the gradient is actually scaled and may have a different sign from what we are calculating, but since we will eventually be squaring these numbers and only care about the relative values, this ends up not impacting our calculations.

Now, we need to use these approximations to calculate the overall energy at a pixel. First, we will use R(x,y)R(x,y), G(x,y)G(x,y), and B(x,y)B(x,y) to represent the red, green, and blue values of the pixel located at (x,y)(x,y). Each one of these function also has an xx and yy gradient like our function ff above which we will denote using the same subscripts of xx and yy. We can now define the square of the total rates of change in the xx and yy direction to be:

Δx2(x,y)=Rx(x,y)2+Gx(x,y)2+Bx(x,y)2 \Delta_x^2(x, y) = R_x(x, y)^2 + G_x(x, y)^2 + B_x(x, y)^2
Δy2(x,y)=Ry(x,y)2+Gy(x,y)2+By(x,y)2 \Delta_y^2(x, y) = R_y(x, y)^2 + G_y(x, y)^2 + B_y(x, y)^2

Where RxR_x and the related functions can be approximated using the method above. Then, for the final part, we need to use the Pythagorean theorem to combine these two values to get the magnitude of the change which gives us our final formula for energy:

energy(x,y)=Δx2(x,y)+Δy2(x,y) \operatorname{energy}(x, y) = \sqrt{\Delta_x^2(x, y) + \Delta_y^2(x, y)}

As an example, consider the 3-by-4 image with RGB values (each component is an integer between 0 and 255) as shown in the table below. (This is the 3x4.png image in your seamcarving/data/ folder.)

Dual Gradient Computation

Example 1

The energy of the non-border pixel (1,2)(1, 2) is calculated using the central difference from pixels (0,2)(0, 2) and (2,2)(2, 2) for the x-gradient

Rx(1,2)=255255=0R_x(1, 2) = 255 - 255 = 0
Gx(1,2)=205203=2G_x(1, 2) = 205 - 203 = 2
Bx(1,2)=25551=204B_x(1, 2) = 255 - 51 = 204

yielding Δx2(1,2)=22+2042=41620\Delta_x^2(1, 2) = 2^2 + 204^2 = 41620;

and pixels (1,1)(1, 1) and (1,3)(1, 3) for the y-gradient

Ry(1,2)=255255=0R_y(1, 2) = 255 - 255 = 0
Gy(1,2)=255153=102G_y(1, 2) = 255 - 153 = 102
By(1,2)=153153=0B_y(1, 2) = 153 - 153 = 0

yielding Δy2(1,2)=1022=10404\Delta_y^2(1, 2) = 102^2 = 10404.

Thus, the energy of pixel (1,2)(1, 2) is 41620+10404=52024\sqrt{41620 + 10404} = \sqrt{52024}.

What is the energy of pixel (1, 1)?

The energy of pixel (1,1)(1, 1) is 2042+1032=52225\sqrt{204^2 + 103^2} = \sqrt{52225}.

Example 2

The energy of the border pixel (0,3)(0, 3) is calculated by the forward difference using pixels (0,3)(0, 3), (1,3)(1, 3), and (2,3)(2, 3) for the x-gradient since it is on the left edge of the image

Rx(0,3)=3255+4255255=0R_x(0, 3) = -3 \cdot 255 + 4 \cdot 255 - 255 = 0
Gx(0,3)=3255+4255255=0G_x(0, 3) = -3 \cdot 255 + 4 \cdot 255 - 255 = 0
Bx(0,3)=351+4153255=204B_x(0, 3) = -3 \cdot 51 + 4 \cdot 153 - 255 = 204

yielding Δx2(0,3)=2042=41616\Delta_x^2(0, 3) = 204^2 = 41616.

Similarly, for the y-gradient we will use the backward difference using pixels (0,3)(0, 3), (0,2)(0, 2), and (0,1)(0, 1) since the point is on the bottom edge of the image.

Ry(0,3)=3255+4255255=0R_y(0, 3) = -3 \cdot 255 + 4 \cdot 255 - 255 = 0
Gy(0,3)=3255+4203153=106G_y(0, 3) = -3 \cdot 255 + 4 \cdot 203 - 153 = -106
By(0,3)=351+45151=0B_y(0, 3) = -3 \cdot 51 + 4 \cdot 51 - 51 = 0

yielding Δy2(0,3)=(106)2=11236\Delta_y^2(0, 3) = (-106)^2 = 11236.

Thus, the energy of pixel (0,3)(0, 3) is 41616+11236=52852\sqrt{41616 + 11236} = \sqrt{52852}.

Summary

The energy for each pixel is given in the table below.

√52852√52641√52432
√52020√52225√52432
√52024√52024√52024
√52852√52020√51220

Runtime Requirements

Task
Make sure that the runtime of apply is in Θ(1)\Theta(1).

After you have the energy function working, you can move on to implementing seam finding.