Image Processing
Comparing graph abstractions and algorithms for DAG shortest paths.
Table of contents
Seam-carving is a content-aware image resizing technique where the size of an image is reduced by one pixel in height (by removing a horizontal seam) or width (by removing a vertical seam) at a time. Unlike cropping pixels from the edges or scaling the entire image, seam carving attempts to identify and preserve the most important content in an image.
- Horizontal seam
- A path of adjacent or diagonally-adjacent pixels from the left edge of an image to its right edge.
- Vertical seam
- The same as a horizontal seam but from the top edge of an image to its bottom edge.
In this project, we will compare 2 graph abstractions and 3 algorithms for seam carving. To get started, join your project team on Canvas. One team member should Fork
Learning objectives
- Design graph abstractions and graph algorithms for image processing.
- Analyze the affordances and running times for graph algorithms.
- Critique the implications of automated image processing algorithms.
Design
Interface
Picture
- A representation of a fixed-size digital image where the color of each pixel is an
int
. In image processing, pixel (x, y) refers to the pixel in column x and row y, with pixel (0, 0) at the upper-left corner and pixel (W − 1, H − 1) at the lower-right corner. Note that this is the opposite of the standard linear algebra notation where (i, j) refers to row i and column j and (0, 0) is at the lower-left corner.
Seam carving is a 3-step algorithm, each represented as a Java class
or interface
.
EnergyFunction
- The energy function defines the importance of a given pixel in a picture. The higher the energy, the greater its importance. The video at 1:45 depicts bright, high-energy pixels around object boundaries.
SeamFinder
- The seam finding algorithm will avoid high-energy pixels by searching for a horizontal seam with the minimum energy path. The horizontal seam is returned as a list of integers representing the y-coordinates. Continuing from 1:45, the video identifies a vertical seam that can be removed from the picture without noticeably affecting the content.
SeamCarver
- The seam carving algorithm combines an energy function with the seam finder to identify and remove the most unnoticeable seams from an picture. This class handles vertical seams by transposing the picture before calling
SeamFinder
.
The focus of this project is on comparing implementations for SeamFinder
. In order to solve this problem, the SeamFinder
algorithm must find a shortest path from the left edge of a Picture
to its right edge. To bootstrap the design, we’ve provided common graph interfaces and algorithms in the graphs
package based on code presented in the lesson.
graphs.Graph<V>
- Represents a directed, edge-weighted graph with a single method that returns a list of the
neighbors
of a given vertex. The directedEdge
class provides 3 methods: access to the origin vertexfrom
, the destination vertexto
, and the edgeweight
. graphs.ShortestPathSolver<V>
- Computes a shortest paths tree in a directed, edge-weighted
Graph
. Implementations of theShortestPathSolver
interface must provide a public constructor that accepts two parameters: a graph and a start vertex. Thesolution
method returns the list of vertices representing a shortest path from the start vertex to the givengoal
vertex. graphs.ExtrinsicMinPQ<T>
- Priority queue interface for
DijkstraSolver
.
The generic type V
is used throughout the graphs
package to represent the vertex data type. All vertices will be of the interface type Node
, which provides a single method called neighbors
that accepts a Picture
and an EnergyFunction
. Rather than defining all of the graph logic for every type of node in the Graph.neighbors
method, this approach allows us to define specialized behavior on a per-node basis by implementing Node.neighbors
.
Node
- An adapter for the
Graph
interface that delegates responsibility forneighbors
to each individualNode
rather than the entireGraph
.
Without the Node
interface, we would need to centralize all of the logic in the Graph
implementation.
public class ExampleGraph implements Graph<Node> {
public List<Node> neighbors(Node node) {
if (isSource(node)) {
return sourceNeighbors(node);
} else if (isSink(node)) {
return sinkNeighbors(node);
} else {
return pixelNeighbors(node);
}
}
private List<Node> sourceNeighbors(Node node) {
...
}
private List<Node> sinkNeighbors(Node node) {
...
}
private List<Node> pixelNeighbors(Node node) {
...
}
}
By introducing the Node
interface, the logic for the graph is divided into separate implementations. This improves encapsulation by enabling each implementation to maintain its own fields and logic separate from other implementations.
public class ExampleGraph implements Graph<Node> {
public List<Node> neighbors(Node node) {
return node.neighbors();
}
}
public class Source implements Node {
public List<Node> neighbors() {
...
}
}
public class Sink implements Node {
public List<Node> neighbors() {
...
}
}
public class Pixel implements Node {
public List<Node> neighbors() {
...
}
}
Reference implementation
AdjacencyListSeamFinder(DijkstraSolver::new)
- An adjacency list-like
Graph
representation combined with Dijkstra’s shortest paths algorithm. TheAdjacencyListSeamFinder
class contains aPixelGraph
class that implementsGraph<Node>
. Most of the logic for initializing the adjacency list representation is in thePixelGraph
constructor. ThePixelGraph
class defines 3Node
implementations, each with their own algorithms for determining how to computeneighbors
.source
- The source node contains 0 incoming edges and
picture.height()
outgoing edges to eachPixel
in the first column of the picture. The weight for each outgoing edge represents the energy of the corresponding pixel in the first column.
Pixel
- Represents an (x, y) coordinate pixel in the underlying picture with directed edges to its right-up, right-middle, and right-down neighbors. Most pixels have 3 adjacent neighbors except for pixels at the boundary of the picture that only have 2 adjacent neighbors. The weight of an edge
from
—to
represents the energy of theto
pixel.
sink
- The sink node contains
picture.height()
incoming edges and 0 outgoing edges: no neighbors.
The AdjacencyListSeamFinder.findSeam
method constructs a PixelGraph
, runs the shortest paths solver on the graph from the source node, and uses the path to the sink node to find a horizontal seam with the minimum energy. Since the shortest paths solver returns the entire path, the source and sink nodes need to be ignored in the final result
with seam.subList(1, seam.size() - 1)
.
The source and sink nodes are anonymous classes that implement the Node
interface since they are nodes in the graph but don’t correspond to pixels in the picture.
Anonymous classes enable you to make your code more concise. They enable you to declare and instantiate a class at the same time. They are like local classes except that they do not have a name. Use them if you need to use a local class only once.
The reference implementation decomposes the problem along the following abstraction levels.
Picture
represents the real colors of each pixel in the picture.DualGradientEnergyFunction
computes the gradient energy for any pixel.PixelGraph
is a graph representation of the picture where gradient energies are placed on the outgoing edges of each corresponding node.ShortestPathSolver
computes the shortest path in thePixelGraph
.
Implementation task
Design and implement an alternative graph representation and graph algorithm for solving shortest paths in a directed acyclic graph.
GenerativeSeamFinder
- Rather than precomputing and storing the neighbors for every single
Pixel
in a 2-D array, this graph representation computes theneighbors
for each pixel on demand to reduce memory usage. Identify the relevant portions of theAdjacencyListSeamFinder.PixelGraph
class and modify it so that thePixelGraph.pixels
andPixel.neighbors
fields are no longer needed. graphs.ToposortDAGSolver
- Computes a shortest paths tree in a directed acyclic graph using the reduction to topological sorting presented in the lesson.
- Initialize
edgeTo
anddistTo
data structures just as inDijkstraSolver
. - List all reachable vertices in depth-first search post-order. Then,
Collections.reverse
the list. - For each node in reverse DFS post-order, apply Dijkstra’s edge relaxation step: if the distance to the neighboring node using the given edge is less than the
distTo
value to the neighboring node, updatedistTo
andedgeTo
accordingly.
DynamicProgrammingSeamFinder
- A dynamic programming seam finding algorithm. This algorithm solves the problem by considering pixels from left to right similar to the
ToposortDAGSolver
except that it operates directly on thePicture
rather than through an abstractGraph
representation.- Initialize a 2-D
double[picture.width()][picture.height()]
to represent the cost of the minimum energy path from the left edge to each pixel. - Iteratively compute the minimum energy path to each pixel starting from the left edge of the picture and working towards the right edge by considering each pixel’s predecessors: the minimum of the left-up, left-middle, and left-down paths.
- Compute a shortest path by starting from the right edge and working back towards the left edge, adding the y-coordinate of each minimum-cost predecessor to a list.
Collections.reverse
the list to obtain the horizontal seam.
- Video
- Explain one part of the project implementation that you’re particularly proud of developing.
Run, check, debug
Open Terminal SeamCarver
class. Use the keyboard shortcut Ctrl+C to stop program execution at any time. Once the SeamCarver
class has finished execution, the rm
command removes the compiled class files.
javac SeamCarver.java && java SeamCarver; rm -f *.class graphs/*.class
By default, the SeamCarver
class uses the AdjacencyListSeamFinder
together with the DijkstraSolver
. Modify the class to test your own implementation.
For debugging, use print statements to output the state of important variables. Study this 4-minute video on debugging with print statements. Note how they spend almost all of the time diagnosing the problem and developing a thorough explanation for why the program is not working as expected. The actual bugfix only takes a few seconds of typing. This is normal practice for all programmers, though programmers who have seen similar bugs before will be able to identify and resolve problems more quickly.
To compare several implementations at once, run the provided SeamFinderMultiTest
, which validates the implementations against the expected minimum energy path cost. If some of your implementations are not ready for testing, comment them out of the implementations
map.
javac SeamFinderMultiTest.java && java SeamFinderMultiTest; rm -f *.class graphs/*.class
Analyze
Affordance analysis
How does the choice of energy function affect the effectiveness of the seam carving algorithm? The DualGradientEnergyFunction
defines the energy of a pixel as a function of RGB (red, green, blue) color differences in adjacent horizontal and vertical pixels.
However, this choice of energy function is not the only option. The video at 3:04 and other blog posts discuss alternative energy functions as they relate to the quality of results.
- Video
- Give an affordance analysis of dual-gradient energy seam carving. Describe some data and example images that we should be particularly concerned about when evaluating any energy-based seam carving algorithm.
Experimental analysis
Run the provided SeamFinderInputSizeExperiments
to compare the real-world runtime of each implementation. For each implementation, SeamFinderInputSizeExperiments
constructs an empty instance and records the number of seconds that it takes to compute a horizontal seam through randomly-generated images of increasing size.
javac SeamFinderInputSizeExperiments.java && java -Xmx200m -Xss4m SeamFinderInputSizeExperiments; rm -f *.class graphs/*.class
The benchmarking script will create a folder named experiment
and output a CSV file for each implementation. Open each CSV file and copy-paste the text into plotting software such as Desmos. Plot the runtimes of all 5 implementations.
Killed
- When testing the largest images, the experiments may stop with a
Killed
message from the operating system whenAdjacencyListSeamFinder(ToposortDAGSolver::new)
runs out of memory. This is fine; focus on the majority of the data that was successfully computed. - Video
- Compare the runtimes for finding a horizontal seam in each of the following algorithms. Are certain algorithms consistently faster than others? Compare the adjacency list representation vs. the generative representation for finding shortest paths.
AdjacencyListSeamFinder(DijkstraSolver::new)
AdjacencyListSeamFinder(ToposortDAGSolver::new)
GenerativeSeamFinder(DijkstraSolver::new)
GenerativeSeamFinder(ToposortDAGSolver::new)
DynamicProgrammingSeamFinder()
Critique
Connect image processing to discussions and opinions occurring in real-world contexts. Here are a few articles and essays to begin your research.
- Everybody Can Make Deepfakes Now! and This AI Makes “Audio Deepfakes”
- Deepfakes are algorithm-generated images, video, and audio designed to resemble real people. “It is important that everyone knows about the fact that we can both perform joint video and audio synthesis for a target subject.”
- ‘For Some Reason I’m Covered in Blood’: GPT-3 Contains Disturbing Bias Against Muslims and DALL·E
- “GPT-3, like all large language models trained on internet corpora, will generate stereotyped or prejudiced content.” If a GPT-3 model is then asked to generate images corresponding to text prompts (DALL·E), what kinds of machine bias may result?
- Gender Shades and Research Paper
- How well do facial recognition services guess the gender of a face? “We evaluate 3 commercial gender classification systems using our dataset and show that darker-skinned females are the most misclassified group (with error rates of up to 34.7%). The maximum error rate for lighter-skinned males is 0.8%.”
- Video
- Synthesize at least one of the above articles together with a couple additional sources. Examine the positionality of the various articles by considering some of the following questions (and any of your own questions). How does the author’s lived experiences lead them to their point of view? How do the distribution of benefits and harms affect different people and society?
Submission
The final deliverable for your project is a video no longer than about 10 minutes that addresses all of the tasks labeled Video
All project team members must be present in the video, which you can do by recording it over a Zoom meeting. The presentation video must be organized around presentation slides. Upload your video to the Canvas assignment. After the submission is complete, add a submission comment with two links:
- Your Ed Workspace.
- Your presentation slides.