Introduction and Energy Calculation
Determining the importance of pixels in an image.
Table of contents
Seam Carving Algorithm
- Background
- Read through the description of the seam carving algorithm.
Seam-carving is a content-aware image resizing technique where the image is reduced in size by one pixel of height (or width) at a time. A vertical seam in an image is a path of adjacent or diagonally-adjacent pixels connected from the top of the image to the bottom, with one pixel in each row; a horizontal seam is the same from left to right. (Seams cannot wrap around the image: e.g., a vertical seam cannot cross from the leftmost column of the image to the rightmost column.) Below left is the original 505-by-287 pixel image; below right is the result after removing 150 vertical seams, resulting in a 30% narrower image. Unlike standard content-agnostic resizing techniques (such as cropping and scaling), seam carving preserves the most interest features (aspect ratio, set of objects present, etc.) of the image.
Although the underlying algorithm is simple and elegant, it was not discovered until 2007. Now, it is now a core feature in Adobe Photoshop and other computer graphics applications.
In this assignment, you will create implement an algorithm that resizes a -by- image using the seam-carving technique.
Notation
In image processing, pixel refers to the pixel in column and row , with pixel at the upper-left corner and pixel at the lower-right corner. This is consistent with the Picture data type that we use in this assignment.
A 3-by-4 image.
- Warning
- This is the opposite of the standard mathematical notation used in linear algebra, where refers to row and column and is at the lower-left corner.
We also assume that the color of each pixel is represented in RGB space, using three integers between 0 and 255. This is consistent with the Color data type.
Overview
Finding and removing a seam involves three parts:
Pre-processing: The first step is to pre-process the image to determine the importance of each pixel. For each pixel, we calculate an energy, a measure of its distinctness from its adjacent pixels; the higher the energy, the more important it is. In this assignment, you will use the dual-gradient energy function, which is described below. Here are the values of the dual-gradient energy function on the surfing image above:
The energy is high (white) for pixels in the image where there is a rapid color gradient (such as the boundary between the sea and sky and the boundary between the surfing Josh Hug on the left and the ocean behind him). The seam-carving technique will avoid removing such high-energy pixels.
Seam finding: The next step is to identify a seam of minimum total energy. The direction of the seam depends on the dimension being resized: resizing horizontally requires vertical seams, and resizing vertically requires horizontal seams. Seam finding is similar to the classic shortest path problem in a weighted directed graph, but with some key differences that will be discussed later.
Seam removal: The final step is remove from the image all of the pixels along the vertical or horizontal seam. This has already been implemented for you in the
SeamCarver
class.
Dual-Gradient Energy Function
- Task
- Implement
DualGradientEnergyFunction
using the following math.
For this assignment, we are going to define the energy of a pixel to be the magnitude of the rate of change of colors at that pixel. In other words, a high energy pixel is one that has a large change from its neighbors while a low energy pixel has very little change between itself and its neighbors. The way we are actually going to calculate this value is by approximating the derivative in the X and Y directions then combine these using the pythagorean theorem to get the overall change.
To start, we need to figure out how to calculate the derivative which in more than one dimension is called the gradient. Usually, when we find derivatives we have a continuous function to look at, but here we only have the descrete pixels making up our picture. This means we will have to use something called numerical differentiation that (luckily) some very smart mathematicians have already figured out for us. For our assignment, we will be using two formulas to calculate the gradient. The first is called the Central Difference and is defined as follows:
Where and represent the gradient of in the direction and direction respectively. These functions are very useful if we know the values of surrounding a point , but if we do not know or cannot calculate we cannot use these functions. This means our approximation method won’t work on the edges of our picture where we don’t have pixels on all sides. Instead, we use a second approximation called the Forward/Backward Difference which is defined as:
Where can be either 1 or -1. When it is 1 we call it the Forward Difference because the points are increasing and when it is -1 we call it the Backward Difference since the points are decreasing. As you can see, this allows us to calculate the approximate value of and on the edges of the picture since our formula only involves points on one side of x and y. One note for pedantics is that the true value of the gradient is actually scaled and may have a different sign from what we are calculating, but since we will eventually be squaring these numbers and only care about the relative values, this ends up not impacting our calculations.
Now, we need to use these approximations to calculate the overall energy at a pixel. First, we will use , , and to represent the red, green, and blue values of the pixel located at . Each one of these function also has an and gradient like our function above which we will denote using the same subscripts of and . We can now define the square of the total rates of change in the and direction to be:
Where and the related functions can be approximated using the method above. Then, for the final part, we need to use the Pythagorean theorem to combine these two values to get the magnitude of the change which gives us our final formula for energy:
As an example, consider the 3-by-4 image with RGB values (each component is an integer between 0 and 255) as shown in the table below. (This is the 3x4.png
image in your seamcarving/data/
folder.)
Example 1
The energy of the non-border pixel is calculated using the central difference from pixels and for the x-gradient
yielding ;
and pixels and for the y-gradient
yielding .
Thus, the energy of pixel is .
What is the energy of pixel (1, 1)?
The energy of pixel is .
Example 2
The energy of the border pixel is calculated by the forward difference using pixels , , and for the x-gradient since it is on the left edge of the image
yielding .
Similarly, for the y-gradient we will use the backward difference using pixels , , and since the point is on the bottom edge of the image.
yielding .
Thus, the energy of pixel is .
Summary
The energy for each pixel is given in the table below.
√52852 | √52641 | √52432 |
√52020 | √52225 | √52432 |
√52024 | √52024 | √52024 |
√52852 | √52020 | √51220 |
Runtime Requirements
- Task
- Make sure that the runtime of
apply
is in .
After you have the energy function working, you can move on to implementing seam finding.