Link Search Menu Expand Document

Image Processing

Comparing graph abstractions and algorithms for DAG shortest paths.

Table of contents

  1. Learning objectives
  2. Design
    1. Interface
    2. Reference implementation
    3. Implementation task
    4. Run, check, debug
  3. Analyze
    1. Affordance analysis
    2. Experimental analysis
  4. Critique
  5. Submission

Seam-carving is a content-aware image resizing technique where the size of an image is reduced by one pixel in height (by removing a horizontal seam) or width (by removing a vertical seam) at a time. Unlike cropping pixels from the edges or scaling the entire image, seam carving attempts to identify and preserve the most important content in an image.

Horizontal seam
A path of adjacent or diagonally-adjacent pixels from the left edge of an image to its right edge.
Vertical seam
The same as a horizontal seam but from the top edge of an image to its bottom edge.

In this project, we will compare 2 graph abstractions and 3 algorithms for seam carving. To get started, join your project team on Canvas. One team member should Fork the starter workspace to create a new workspace and Share it with other team members’ UW emails.

Learning objectives

  • Design graph abstractions and graph algorithms for image processing.
  • Analyze the affordances and running times for graph algorithms.
  • Critique the implications of automated image processing algorithms.

Design

Interface

Picture
A representation of a fixed-size digital image where the color of each pixel is an int. In image processing, pixel (x, y) refers to the pixel in column x and row y, with pixel (0, 0) at the upper-left corner and pixel (W − 1, H − 1) at the lower-right corner. Note that this is the opposite of the standard linear algebra notation where (i, j) refers to row i and column j and (0, 0) is at the lower-left corner.

Seam carving is a 3-step algorithm, each represented as a Java class or interface.

EnergyFunction
The energy function defines the importance of a given pixel in a picture. The higher the energy, the greater its importance. The video at 1:45 depicts bright, high-energy pixels around object boundaries.
SeamFinder
The seam finding algorithm will avoid high-energy pixels by searching for a horizontal seam with the minimum energy path. The horizontal seam is returned as a list of integers representing the y-coordinates. Continuing from 1:45, the video identifies a vertical seam that can be removed from the picture without noticeably affecting the content.
SeamCarver
The seam carving algorithm combines an energy function with the seam finder to identify and remove the most unnoticeable seams from an picture. This class handles vertical seams by transposing the picture before calling SeamFinder.

The focus of this project is on comparing implementations for SeamFinder. In order to solve this problem, the SeamFinder algorithm must find a shortest path from the left edge of a Picture to its right edge. To bootstrap the design, we’ve provided common graph interfaces and algorithms in the graphs package based on code presented in the lesson.

graphs.Graph<V>
Represents a directed, edge-weighted graph with a single method that returns a list of the neighbors of a given vertex. The directed Edge class provides 3 methods: access to the origin vertex from, the destination vertex to, and the edge weight.
graphs.ShortestPathSolver<V>
Computes a shortest paths tree in a directed, edge-weighted Graph. Implementations of the ShortestPathSolver interface must provide a public constructor that accepts two parameters: a graph and a start vertex. The solution method returns the list of vertices representing a shortest path from the start vertex to the given goal vertex.
graphs.ExtrinsicMinPQ<T>
Priority queue interface for DijkstraSolver.

The generic type V is used throughout the graphs package to represent the vertex data type. All vertices will be of the interface type Node, which provides a single method called neighbors that accepts a Picture and an EnergyFunction. Rather than defining all of the graph logic for every type of node in the Graph.neighbors method, this approach allows us to define specialized behavior on a per-node basis by implementing Node.neighbors.

Node
An adapter for the Graph interface that delegates responsibility for neighbors to each individual Node rather than the entire Graph.

Without the Node interface, we would need to centralize all of the logic in the Graph implementation.

public class ExampleGraph implements Graph<Node> {
    public List<Node> neighbors(Node node) {
        if (isSource(node)) {
            return sourceNeighbors(node);
        } else if (isSink(node)) {
            return sinkNeighbors(node);
        } else {
            return pixelNeighbors(node);
        }
    }

    private List<Node> sourceNeighbors(Node node) {
        ...
    }

    private List<Node> sinkNeighbors(Node node) {
        ...
    }

    private List<Node> pixelNeighbors(Node node) {
        ...
    }
}

By introducing the Node interface, the logic for the graph is divided into separate implementations. This improves encapsulation by enabling each implementation to maintain its own fields and logic separate from other implementations.

public class ExampleGraph implements Graph<Node> {
    public List<Node> neighbors(Node node) {
        return node.neighbors();
    }
}
public class Source implements Node {
    public List<Node> neighbors() {
        ...
    }
}
public class Sink implements Node {
    public List<Node> neighbors() {
        ...
    }
}
public class Pixel implements Node {
    public List<Node> neighbors() {
        ...
    }
}

Reference implementation

AdjacencyListSeamFinder(DijkstraSolver::new)
An adjacency list-like Graph representation combined with Dijkstra’s shortest paths algorithm. The AdjacencyListSeamFinder class contains a PixelGraph class that implements Graph<Node>. Most of the logic for initializing the adjacency list representation is in the PixelGraph constructor. The PixelGraph class defines 3 Node implementations, each with their own algorithms for determining how to compute neighbors.
source
The source node contains 0 incoming edges and picture.height() outgoing edges to each Pixel in the first column of the picture. The weight for each outgoing edge represents the energy of the corresponding pixel in the first column.
Pixel
Represents an (x, y) coordinate pixel in the underlying picture with directed edges to its right-up, right-middle, and right-down neighbors. Most pixels have 3 adjacent neighbors except for pixels at the boundary of the picture that only have 2 adjacent neighbors. The weight of an edge fromto represents the energy of the to pixel.
sink
The sink node contains picture.height() incoming edges and 0 outgoing edges: no neighbors.

The AdjacencyListSeamFinder.findSeam method constructs a PixelGraph, runs the shortest paths solver on the graph from the source node, and uses the path to the sink node to find a horizontal seam with the minimum energy. Since the shortest paths solver returns the entire path, the source and sink nodes need to be ignored in the final result with seam.subList(1, seam.size() - 1).

The source and sink nodes are anonymous classes that implement the Node interface since they are nodes in the graph but don’t correspond to pixels in the picture.

Anonymous classes enable you to make your code more concise. They enable you to declare and instantiate a class at the same time. They are like local classes except that they do not have a name. Use them if you need to use a local class only once.

The reference implementation decomposes the problem along the following abstraction levels.

  • Picture represents the real colors of each pixel in the picture.
  • DualGradientEnergyFunction computes the gradient energy for any pixel.
  • PixelGraph is a graph representation of the picture where gradient energies are placed on the outgoing edges of each corresponding node.
  • ShortestPathSolver computes the shortest path in the PixelGraph.

Implementation task

Design and implement an alternative graph representation and graph algorithm for solving shortest paths in a directed acyclic graph.

GenerativeSeamFinder
Rather than precomputing and storing the neighbors for every single Pixel in a 2-D array, this graph representation computes the neighbors for each pixel on demand to reduce memory usage. Identify the relevant portions of the AdjacencyListSeamFinder.PixelGraph class and modify it so that the PixelGraph.pixels and Pixel.neighbors fields are no longer needed.
graphs.ToposortDAGSolver
Computes a shortest paths tree in a directed acyclic graph using the reduction to topological sorting presented in the lesson.
  1. Initialize edgeTo and distTo data structures just as in DijkstraSolver.
  2. List all reachable vertices in depth-first search post-order. Then, Collections.reverse the list.
  3. For each node in reverse DFS post-order, apply Dijkstra’s edge relaxation step: if the distance to the neighboring node using the given edge is less than the distTo value to the neighboring node, update distTo and edgeTo accordingly.
DynamicProgrammingSeamFinder
A dynamic programming seam finding algorithm. This algorithm solves the problem by considering pixels from left to right similar to the ToposortDAGSolver except that it operates directly on the Picture rather than through an abstract Graph representation.
  1. Initialize a 2-D double[picture.width()][picture.height()] to represent the cost of the minimum energy path from the left edge to each pixel.
  2. Iteratively compute the minimum energy path to each pixel starting from the left edge of the picture and working towards the right edge by considering each pixel’s predecessors: the minimum of the left-up, left-middle, and left-down paths.
  3. Compute a shortest path by starting from the right edge and working back towards the left edge, adding the y-coordinate of each minimum-cost predecessor to a list.
  4. Collections.reverse the list to obtain the horizontal seam.
Video
Explain one part of the project implementation that you’re particularly proud of developing.

Run, check, debug

Open Terminal and enter the following command to compile and run the SeamCarver class. Use the keyboard shortcut Ctrl+C to stop program execution at any time. Once the SeamCarver class has finished execution, the rm command removes the compiled class files.

javac SeamCarver.java && java SeamCarver; rm -f *.class graphs/*.class

By default, the SeamCarver class uses the AdjacencyListSeamFinder together with the DijkstraSolver. Modify the class to test your own implementation.

For debugging, use print statements to output the state of important variables. Study this 4-minute video on debugging with print statements. Note how they spend almost all of the time diagnosing the problem and developing a thorough explanation for why the program is not working as expected. The actual bugfix only takes a few seconds of typing. This is normal practice for all programmers, though programmers who have seen similar bugs before will be able to identify and resolve problems more quickly.

To compare several implementations at once, run the provided SeamFinderMultiTest, which validates the implementations against the expected minimum energy path cost. If some of your implementations are not ready for testing, comment them out of the implementations map.

javac SeamFinderMultiTest.java && java SeamFinderMultiTest; rm -f *.class graphs/*.class

Analyze

Affordance analysis

How does the choice of energy function affect the effectiveness of the seam carving algorithm? The DualGradientEnergyFunction defines the energy of a pixel as a function of RGB (red, green, blue) color differences in adjacent horizontal and vertical pixels.

Dual-gradient energies

However, this choice of energy function is not the only option. The video at 3:04 and other blog posts discuss alternative energy functions as they relate to the quality of results.

Video
Give an affordance analysis of dual-gradient energy seam carving. Describe some data and example images that we should be particularly concerned about when evaluating any energy-based seam carving algorithm.

Experimental analysis

Run the provided SeamFinderInputSizeExperiments to compare the real-world runtime of each implementation. For each implementation, SeamFinderInputSizeExperiments constructs an empty instance and records the number of seconds that it takes to compute a horizontal seam through randomly-generated images of increasing size.

javac SeamFinderInputSizeExperiments.java && java -Xmx200m -Xss4m SeamFinderInputSizeExperiments; rm -f *.class graphs/*.class

The benchmarking script will create a folder named experiment and output a CSV file for each implementation. Open each CSV file and copy-paste the text into plotting software such as Desmos. Plot the runtimes of all 5 implementations.

Killed
When testing the largest images, the experiments may stop with a Killed message from the operating system when AdjacencyListSeamFinder(ToposortDAGSolver::new) runs out of memory. This is fine; focus on the majority of the data that was successfully computed.
Video
Compare the runtimes for finding a horizontal seam in each of the following algorithms. Are certain algorithms consistently faster than others? Compare the adjacency list representation vs. the generative representation for finding shortest paths.
  1. AdjacencyListSeamFinder(DijkstraSolver::new)
  2. AdjacencyListSeamFinder(ToposortDAGSolver::new)
  3. GenerativeSeamFinder(DijkstraSolver::new)
  4. GenerativeSeamFinder(ToposortDAGSolver::new)
  5. DynamicProgrammingSeamFinder()

Critique

Connect image processing to discussions and opinions occurring in real-world contexts. Here are a few articles and essays to begin your research.

Everybody Can Make Deepfakes Now! and This AI Makes “Audio Deepfakes”
Deepfakes are algorithm-generated images, video, and audio designed to resemble real people. “It is important that everyone knows about the fact that we can both perform joint video and audio synthesis for a target subject.”
‘For Some Reason I’m Covered in Blood’: GPT-3 Contains Disturbing Bias Against Muslims and DALL·E
“GPT-3, like all large language models trained on internet corpora, will generate stereotyped or prejudiced content.” If a GPT-3 model is then asked to generate images corresponding to text prompts (DALL·E), what kinds of machine bias may result?
Gender Shades and Research Paper
How well do facial recognition services guess the gender of a face? “We evaluate 3 commercial gender classification systems using our dataset and show that darker-skinned females are the most misclassified group (with error rates of up to 34.7%). The maximum error rate for lighter-skinned males is 0.8%.”
Video
Synthesize at least one of the above articles together with a couple additional sources. Examine the positionality of the various articles by considering some of the following questions (and any of your own questions). How does the author’s lived experiences lead them to their point of view? How do the distribution of benefits and harms affect different people and society?

Submission

The final deliverable for your project is a video no longer than about 10 minutes that addresses all of the tasks labeled Video

All project team members must be present in the video, which you can do by recording it over a Zoom meeting. The presentation video must be organized around presentation slides. Upload your video to the Canvas assignment. After the submission is complete, add a submission comment with two links:

  1. Your Ed Workspace.
  2. Your presentation slides.