By the end of this homework assignment, you should be familiar/comfortable with the following concepts/constructs:
Hopefully, you will also have had some fun manipulating images, and learned some "higher-order thinking" as well.
Warning: Do not start on this assignment the night before it is due! It will take you a fair amount of time to fully absorb this writeup and the code that we provide you.
A two-dimensional image can be naturally modeled as a function from the 2D plane to the colors:
(* Point: (x, y) *) datatype Point = Pt of real * real; (* Color: 3 color channels + transparency channel *) datatype Color = RGB of real * real * real * real; (* An image maps points onto colors *) type Image = Point -> Color;
In other words, given any point in the real plane, an image can tell you what color it produces at that point. We call such a function a functional image.
A context-free grammar is a way of specifying a particular class of "languages". I put languages in quotes because you should not think of these as computer or human languages necessarily; a "language", in this formulation, is just a well-defined set of strings. A grammar for a language defines the set of strings (or "sentences") that are legal in that language. Don't worry if this is not entirely clear at the moment; we'll explain more about grammars below.
Grammars can be used to specify classes of images as well as strings. In this assignment, we'll define some data types for you to work with functional images and grammars. Your job will be to implement a few images and image generating functions, and ultimately to produce a function that generates random sentences in an image grammar.
FuncImage
:
A SML structure containing type definitions and functions for
manipulating functional images and some other geometric
primitives.BMPFile
and
BMP_FILE
: A
structure and signature for writing functional images into the
common Windows bitmap (BMP) file format. I chose this file
format because it was simple to code up, and because you will
then be able to use these images as your desktop wallpaper under
Windows (actually, Mac and Linux can use BMPs too---but those
platforms can use almost any image file format for
wallpaper).RandomInt
:
A structure for generating random integers in a specified
range.ImageGrammar
:
A structure that contains partial definitions of the grammatic
image types and functions.SimpleTable
and
SIMPLE_TABLE
: A
structure and signature for simple lookup tables.genRandomSentence
that generates a
random sentence in the image grammar. We define these terms
below.These will be specified more precisely in problems.
In order to define a language, we must first define the notion
of alphabet. An alphabet is simply a set of
tokens over which you can form strings. For example, consider the
alphabet {a, b}
(the letters 'a' and 'b'). Here are
some strings in this alphabet:
aba bbbbb a ; Single-token string ; The empty string
If we view a and b as words, for example a = "ug!" and b = "oo!", then it's easy to see that our strings are also sentences:
ug! oo! ug! ; = 'aba' oo! oo! oo! oo! oo! ; = 'bbbbb' ug! ; = 'a' ; silence
From now on, we'll use the terms "string" and "sentence" interchangeably. Now, given an alphabet, we might wish to talk about strings with particular forms. For example, we might wish to talk about:
A language, then, is a subset of the strings that can be generated over an alphabet. How can we describe a language? One way is to list all the sentences that exist in a language. That's hard for languages with an infinite number of strings, like the two languages above. Another way is to use an English description, as above, but that's not suitable for machine manipulation, or formal study. So, we need a formalism.
It's easy to imagine that some languages are very complicated to describe, and others are very simple. And, as it turns out, humans have invented a variety of formal ways to describe languages, varying in power and expressiveness. One way is to use context-free grammars.
Context-free grammars, or CFGs, are a way of describing a certain class of languages (called, surprise! context-free languages). They were discovered independently by the linguist Noam Chomsky and the computer scientists John Backus and Peter Naur. A CFG has three components:
A production rule is of the following format:
name ::= expansion1 | expansion2 | expansion3 | ... | expansionN
where name
is the name of a production, and each
expansion represents one "legal substitution" for that production.
(Read the bar symbol "|" as "or".) Each of the expansions is a
string, where each element in the string is either
Here is a simple context-free grammar:
terminals: "Hi" "Bye" "Alice" "Bob" rules: word ::= "Hi" | "Bye" name ::= "Alice" | "Bob" statement ::= word name root rule: statement
There are only four possible sentences in this language, producible by starting at the root rule and expanding them until only terminals are left. When there is more than one expansion for a production, you can choose any of them. Here are the possible expansion paths, called "derivations":
statement => word name => word "Bob" => "Hi" "Bob" statement => word name => word "Bob" => "Bye" "Bob" statement => word name => word "Alice" => "Hi" "Alice" statement => word name => word "Alice" => "Bye" "Alice"
Life gets more interesting when grammars have recursive productions, e.g.:
terminals: "(" ")" "foo" "bar" rules: atom ::= "foo" "bar" S ::= "(" L ")" | atom L ::= L S | S root rule: L
This grammar describes a language that looks very similar to one that you know... can you figure out what it is? Try generating a few sentences in this grammar.
So far, we have presented context-free grammars as generators of linear strings. However, viewed another way, you can think of a context-free grammar as a template for a set of trees. In this view, the terminals of a grammar can be viewed as leaf nodes in a tree, and productions can be viewed as internal nodes. Here is the tree corresponding to a production in the "Hi Alice" grammar we showed above:
statement / \ word name / \ "Hi" "Bob"
Here's a tree in the second grammar:
L | S / | \ "(" L ")" / \ L S / \ \ L S atom / | \ S atom "foo" | | atom "bar" | "foo"
This tree corresponds to the sentence "( foo bar foo )". The process of taking a string and turning it into a tree in some grammar is called parsing. The result of parsing is a parse tree, sometimes called the syntax tree. Parsing is a basic part of any computer language implementation; it is quite well-understood and there are tools called parser generators that can construct parsers automatically from a grammar specification.
In this assignment, we won't do any parsing, because we'll be going in the other direction. You'll be taking a grammar and producing a random tree in that grammar.
We model colors using the following datatype:
(* Color: 3 color channels + transparency channel *) datatype Color = RGB of real * real * real * real;
The first three reals in this datatype are the red, green, and blue components respectively. By convention, these values must be between 0.0 (no value for this color) and 1.0 (full value for this color). You probably know that by combining red, green, and blue light in correct proportions, you can generate the whole gamut of visible colors; for example, yellow can be generated by the color
RGB(1.0, 1.0, 0.0, 1.0) ; maximum red + maximum green
The fourth element in our color type is a little less well-known; it's the alpha channel. The alpha channel is a measure of the color's opacity: if alpha = 1.0, the color is fully opaque, and if alpha = 0.0 then the color is transparent. We'll use the alpha channel when blending images, or overlaying one image onto another. This will become clear in the examples. For convenience, we'll define the invisible color as follows:
val INVISIBLE = RGB(0.0, 0.0, 0.0, 0.0);
As previous noted, a functional image is a function that maps from the 2D plane to the colors. Any function with this signature can be viewed as an image. Here's a very simple image:
fun white(point) = RGB(1.0, 1.0, 1.0, 1.0);
This image returns the opaque white color for every point in the real plane. In other words, this is an infinite plane of whiteness. Here's another example:
fun redCircle(Pt(x, y)) = if Math.sqrt(x*x + y*y) <= 2.0 then RGB(1.0, 0.0, 0.0, 1.0) else INVISIBLE
This function returns the opaque red pixel for all values that are within 2.0 units of the origin; it returns invisible for all others. So, in other words, this function is a red circle of radius 2, centered at the origin.
Functional images are interesting, but images that you can't view are pure abstractions. In order to render a functional image viewable, we must create a concrete embodiment of that image. We need to "render" the image as an array of actual color values, so that we can display it on the screen or write it to a file.
However, functional images are infinite, and arrays are finite.
Therefore, we need to select a finite set of points in the plane
to sample. To that end, we've provided a datatype
RenderSpec
:
(* A rectangle is defined by two points (topLeft, lowerRight) *) datatype Rectangle = Rect of Point * Point; (* A rendering spec is a rectangular window, combined with an integer "resolution" for that window. The resolution is the number of pixels per real unit. For example, if the resolution is 50, then we should draw 50 evenly spaced pixels between 0.0 and 1.0, 50 evenly spaced pixels between 1.0 and 2.0, etc *) datatype RenderSpec = RendSpec of Rectangle * int;
We've also provided a function render
that will
take an image and a rendering spec, and produce a
RenderedImage
. You don't have to know anything in
particular about RenderedImage
instances, except that
you can pass them to the function writeBMP
.
Finally, the writeBMP
function accepts a
RenderedImage
and a string, and writes the rendered
image to a BMP file named by the given string. You can then view
the result in any common graphics program, like Windows Paint or
the GIMP.
Here's a complete example of image rendering:
val aSquare = FuncImage.Rect(FuncImage.Pt(~2.0, 2.0), FuncImage.Pt(2.0, ~2.0)); val myRenderSpec = FuncImage.RendSpec(aSquare, 50); val renderedRedCircle = FuncImage.render(redCircle, myRenderSpec); BMPFile.writeBMP(renderedRedCircle, "redcircle.bmp");
The file redcircle.bmp is the result of executing the above code.
If an image is a function, then how do we manipulate functions? How can we make image factories? For example, the red circle above is not very general. Perhaps we would like to write a function makes circles of various colors and sizes. To do this, we need a higher-order function:
fun makeCircle(color, radius) = fn (Pt(x, y)) => if Math.sqrt(x*x + y*y) <= radius then color else INVISIBLE;
The return value is an Image function. Applying
makeCircle
to various tuples (color, radius) will
make circles (functions) of all colors and sizes.
What if we have one of these circles, and we want it on a
non-invisible background? To do this, we need an overlay
function, which takes two images ("over" and "under") and lays
them one on top of another. Since this isn't a graphics course,
we've provided this function for you in the FuncImage
structure.
structure FuncImage = struct ... fun alphaOverlay(below:Image, above:Image):Image = ... end; (* A blue cirle and a white background. *) val blueCircle = makeCircle(FuncImage.RGB(0.0, 0.0, 1.0, 1.0), 1.0); fun white(point) = FuncImage.RGB(1.0, 1.0, 1.0, 1.0); (* Blue circle laid on a white background. *) val circleOnWhite = FuncImage.alphaOverlay(white, blueCircle);
By composing simple functions, it is easy to build up layered images of surprising complexity and texture; here's a 1024x768-sized image (ZIP) I created using about 120 lines of ML code (most of which consists of utility functions that could be reused in many different contexts; the composition of the final product is about a dozen lines of ML).
As previously noted, grammars can be viewed as specifications for the structure of trees. As it turns out, we can view images as trees---and therefore, grammars can be used to generate families of images.
A grammatic image is specified by a sentence (i.e., a tree):
datatype Sentence = Leaf of FuncImage.Image | PartitionNode of (FuncImage.Window * Sentence) list;
A leaf in the tree contains any image. This is like a "word" in an ordinary sentence. To draw a leaf, you just draw it in the area allocated to it.
Internal nodes in the tree are specified by PartitionNodes. A partition node divides up space into several rectangular windows, each of which may contain another sentence. The datatype for a window is as follows:
type PointTransform = Point -> Point; type Window = Rectangle * PointTransform;
A window specifies that one rectangular area "stands for"
another. A PointTransform
maps one point in 2D space
to another; for example, here's the "y-axis flip" point
transform:
fun flipY(Pt(x, y)) = Pt(~x, y);
In the case of a Window
value, the
PointTransform
member should map the points in the
rectangular window to some other area. In particular, we will
want to map the rectangular window to some other rectangular
window. For your convenience, we have provided the
makeRectTransform
function, which generates this kind
of "rectangle to rectangle" transform:
fun makeRectTransform(fromRect as Rect(p1, p2), toRect as Rect(q1, q2)):PointTransform = ..
sample-sentence.sml shows a sample sentence. The result of rendering this image is shown below:
We have built this image within the unit square,
(Rect(Pt(~1.0,1.0),(1.0,~1.0))
). The sentence
contains three partitions, and so does the image. The first
partition splits the image into two halves, one above the other.
Each window contains both the rectangle and a function that maps
the rectangle onto the entire unit square. To each subimage in
each partition, the square that it has been assigned appears
to be the entire square. The transform functions at each
PartitionNode
perform this mapping.
The tree corresponding to the above image is as follows (some information omitted):
partition / \ leaf partition / / \ redCircle leaf partition / / \ blueSquare leaf leaf / \ greenSquare redCircle
It is easy to see that this tree, and many like it, could be generated by a grammar. The only really tricky part is constructing the appropriate rectangular partitions and scaling transformations.
Recall that grammars have three elements: terminals, productions, and a root production name. How do we map those into our new concept of grammatic images?
The terminals in our grammar are simple Image
values. Since images are functions, and functions are not easily
named, we'd like to have names for functions. We'll store the
terminals in a table mapping from strings to images:
type TerminalTable = (string, FuncImage.Image) SimpleTable.table;
Next, we need a datatype to represent productions. A production is a name, plus a list of expansions. But what is an "expansion" in this context? An expansion cannot just be a list of images; partition nodes must divide up the space in the current image. Therefore, an expansion is either
(* A rectangle splitter is a function that splits one rectangle into several subrectangles. *) type RectangleSplitter = FuncImage.Rectangle -> FuncImage.Rectangle list; (* Any individual expansion can be either a leaf/terminal, or a rectangle splitter and a list of productions for each subrectangle. At sentence generation time, we'll choose an expansion and apply the rectangle splitter to derive a list of subrectangles. *) datatype Expansion = MakeLeaf of string | MakePartition of RectangleSplitter * (string list); (* Productions have a name and a list of expansions for that name. *) datatype Production = Prod of string * (Expansion list);
The partition node's RectangleSplitter
must return
a number of rectangles equal to the number of strings in the list
for the MakePartition
node. There's no good way to
enforce this restriction using ML's type system; we leave it to
the user.
Here is the datatype for a complete grammar:
datatype Grammar = Gram of TerminalTable * Production list * string;
The third component is the name of the root production. When we're generating sentences, we will look up the root production and begin there.
sample-grammar.sml contains an
example of a simple grammar, one that could have generated the
example image above. This grammar uses a very simple
RectangleSplitter
, one that splits a rectangle into
two equal halves each time.
You may have noticed something odd in our definition. Namely, we say a grammatic image may partition its own top-level area into subareas. However, our original definition functional image does not have any notion of top-level area: a functional image maps every point in the plane to a Color, and therefore extends from the origin out to infinity in every direction (even if, as in the case of our circle functions, most of that infinite plane is empty).
When generating a tree, therefore, we must provide both a grammar and an initial area which can be split up:
(* A "sentence generator" is a function that takes a grammar and produces a random sentence in that grammar. *) type SentenceGenerator = Grammar * FuncImage.Rectangle -> Sentence;
The sentence generator has the following behavior:
MakeLeaf
expansion, it will construct a Leaf
sentence, looking
up appropriate image in the terminal table.When a sentence generator encounters a
MakePartition
expansion, it will construct a
PartitionNode
. Recall that PartitionNodes contain
both a Window
and a subsentence; it creates these two
pieces as follows:
MakePartition
's
RectangleSplitter
to the current rectangle
parameter, yielding a list of subrectangles.PointTransform
functions that map from
each subrectangle to the top-level rectangle. For example, in
the sample image above, the PointTransform
maps
each little rectangle to the top-level unit square.This is actually the entire algorithm, in general terms. The details are left as an exercise to the reader (see below).
Provide solutions for the following problems:
In your own words, using no more than 2 or 3 English
sentences, summarize briefly what the
FuncImage.render
function does, and how it uses
the image functions passed to it.
Use the Math.cos
(cosine over reals) function
and FuncImage.alphaOverlay
to write an image
function named blueWaves
that, when rendered,
results in something like the following image (ignore the
border, that's an artifact of this web page):
(Hints: Vary the alpha, not the blue component. Also, if you can't get the gradients to look smooth, try scaling the image up or down a little bit.)
A "texture" is simply a functional image that paints something over the real plane. Any image can be viewed as a texture, but some are better suited than others. For example, here is a checkerboard texture of light red and dark red:
fun checker(FuncImage.Pt(x, y)) = let val i = Real.floor(x) + Real.floor(y) in if i mod 2 = 0 then FuncImage.RGB(1.0, 0.5, 0.5, 1.0) else FuncImage.RGB(0.6, 0.2, 0.2, 1.0) end;
For this question, write the function
makeTexturedCircle
. Instead of taking a simple
color value as its first parameter,
makeTexturedCircle
takes a texture; in other
words, its type is
Image * real -> Image;
Here's an example of the checkerboard texture above applied
to a circle using makeTexturedCircle
, then
overlaid on a white background using
alphaOverlay
:
Implement the function genRandomSentence
,
which takes a grammar and a rectangular area and produces a
random grammatic image, corresponding to this grammar, in the
rectangular area. This function therefore has the type
ImageGrammr.Grammar * FuncImage.Rectangle -> ImageGrammar.Sentence;
Here are some hints for solving parts of this problem:
Sentence
tree. This will
allow you to examine the sentences that your function is
generating.Window
for
each subrectangle, and recursively generate an
appropriate subsentence for that rectangle.ImageGrammar.genImage
function is
properly implemented. Use it to render images, to test your
function!TextIO.print
statements at strategic
points in your code.