For example, suppose we want an ArrayList of Strings. We describe the type as:

    ArrayList<String>

When we construct an ArrayIntList, we say:

    ArrayIntList lst = new ArrayIntList();

Imagine replacing both occurrences of "ArrayIntList" with "ArrayList<String>" and you'll see how to construct an ArrayList<String>:

    ArrayList<String> lst = new ArrayList<String>();

And in the same way that you would declare a method header for manipulating an ArrayIntList object:

    public void doSomethingCool(ArrayIntList lst) { ... }

you can use ArrayList<String> in place of ArrayIntList to declare a method that takes an ArrayList<String> as a parameter:

    public void doSomethingCool(ArrayList<String> list) { ... }

It can even be used as a return type if you want to have the method return an ArrayList:

    public ArrayList<String> doSomethingCool(ArrayList<String> list) { ... }

Once you have declared an ArrayList<String>, you can manipulate it with the kinds of calls we have made on our ArrayIntList, but using Strings instead of ints:

    ArrayList<String> list = new ArrayList<String>();
    list.add("hello");
    list.add("there");
    list.add(0, "fun");
    System.out.println(list);

which produces this output:

    [fun, hello, there]

All of the methods we have seen with ArrayIntList are defined for ArrayList: the appending add, add at an index, remove, size, get, etc. So we could write the following loop to print each String from an ArrayList<String>:

    for (int i = 0; i < lst.size(); i++) {
        System.out.println(lst.get(i));
    }

I then spent a little time discussing the issue of primitive data versus objects. Even though we can construct an ArrayList<E> for any class E, we can't construct an ArrayList<int> because int is a primitive type, not a class. To get around this problem, Java has a set of classes known as "wrapper" classes that "wrap up" primitive values like ints to turn them into objects. It's very much like taking a candy and putting a wrapper around it. For the case of ints, there is a class known as Integer that can be used to store an individual int. Each Integer object has a single data field: the int that is wrapped up inside.
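To make the wrapper idea concrete, here is a minimal sketch of wrapping and unwrapping by hand with the standard Integer.valueOf and intValue methods. The class name WrapperDemo and the helper productOfFirstTwo are made up for this illustration.

```java
import java.util.ArrayList;

public class WrapperDemo {
    // Hypothetical helper: multiplies the first two values of a list,
    // unwrapping each Integer by hand with intValue().
    public static int productOfFirstTwo(ArrayList<Integer> list) {
        Integer first = list.get(0);
        Integer second = list.get(1);
        return first.intValue() * second.intValue();
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<Integer>();
        // Wrap each int in an Integer object before storing it.
        list.add(Integer.valueOf(18));
        list.add(Integer.valueOf(34));
        System.out.println(productOfFirstTwo(list)); // prints 612
    }
}
```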
Java 5 also has quite a bit of support that makes a lot of this invisible to programmers. If you want to put int values into an ArrayList, you have to remember to use the type ArrayList<Integer> rather than ArrayList<int>, but otherwise Java does a lot of things for you. For example, you can construct such a list and add simple int values to it:

    ArrayList<Integer> list = new ArrayList<Integer>();
    list.add(18);
    list.add(34);

In the two calls on add, we are passing simple ints as arguments to something that really requires an Integer. This is okay because Java will automatically "box" the ints for us (i.e., wrap them up in Integer objects). We can also refer to elements of this list and treat them as simple ints, as in:

    int product = list.get(0) * list.get(1);

The calls on list.get return references to Integer objects, and normally you wouldn't be allowed to multiply two objects together. In this case Java automatically "unboxes" the values for you, unwrapping the Integer objects and giving you the ints that are contained inside.
Every primitive type has a corresponding wrapper class: Integer for int, Double for double, Character for char, Boolean for boolean, and so on.
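The same pattern should carry over to the other wrapper classes. The sketch below is only an illustration: the class name OtherWrappersDemo and the helper sum are invented here, and the values are arbitrary.

```java
import java.util.ArrayList;

public class OtherWrappersDemo {
    // Hypothetical helper: adds up a list of Double values.
    public static double sum(ArrayList<Double> values) {
        double total = 0.0;
        for (int i = 0; i < values.size(); i++) {
            total += values.get(i); // each Double is unboxed to a double
        }
        return total;
    }

    public static void main(String[] args) {
        ArrayList<Double> numbers = new ArrayList<Double>();
        numbers.add(1.5);  // boxed into a Double
        numbers.add(2.25);
        System.out.println(sum(numbers)); // prints 3.75

        ArrayList<Character> letters = new ArrayList<Character>();
        letters.add('a');  // boxed into a Character
        letters.add('b');
        System.out.println(letters); // prints [a, b]
    }
}
```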
Then I mentioned that I hoped people were aware of the array initializer syntax where you can use curly braces to specify a set of values to use for initializing an array:

    int[] data = {8, 27, 93, 4, 5, 15, 206};

This is a great way to define data to use for a testing program. I asked people how we'd find the product of these values and people suggested the standard approach that uses an int to index the array:

    int product = 1;
    for (int i = 0; i < data.length; i++) {
        product *= data[i];
    }

This approach works, but there is a simpler way to do this. If all you want to do is to iterate over the values of an array one at a time, you can use what is called a for-each loop:

    int product = 1;
    for (int n : data) {
        product *= n;
    }

We generally read the for loop header as, "For each int n in data...". The choice of "n" is arbitrary. It defines a local variable for the loop. I could just as easily have called it "x" or "foo" or "value". Notice that in the for-each loop, I don't have to use any bracket notation. Instead, each time through the loop Java sets the variable n to the next value from the array. I also don't need to test for the length of the array. Java does that for you when you use a for-each loop.
There are some limitations of for-each loops. You can't use them to change the contents of the array. If you assign a value to the variable n, you are just changing a local variable inside the loop. It has no effect on the array itself.
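A small sketch of this limitation, with made-up names (ForEachLimitDemo, doubleWithForEach, doubleWithIndex): the for-each version leaves the array untouched, while the indexed version actually changes it.

```java
import java.util.Arrays;

public class ForEachLimitDemo {
    // Tries to double every element with a for-each loop.
    // The array is NOT changed: n is only a local copy of each value.
    public static void doubleWithForEach(int[] data) {
        for (int n : data) {
            n = n * 2; // changes only the local variable n
        }
    }

    // Doubles every element using an index, which does change the array.
    public static void doubleWithIndex(int[] data) {
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i] * 2;
        }
    }

    public static void main(String[] args) {
        int[] data = {8, 27, 93};
        doubleWithForEach(data);
        System.out.println(Arrays.toString(data)); // prints [8, 27, 93]
        doubleWithIndex(data);
        System.out.println(Arrays.toString(data)); // prints [16, 54, 186]
    }
}
```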
As with arrays, we can use a for-each loop for ArrayLists, so we could say:

    String[] data2 = {"four", "score", "and", "seven", "years", "ago"};
    ArrayList<String> lst = new ArrayList<String>();
    for (String s : data2) {
        lst.add(s);
    }
    System.out.println(lst);

which produces this output:

    [four, score, and, seven, years, ago]

Then I switched to talking about grammars. We are going to use an approach to describing grammars that is known as a "production system". It is well known to those who study formal linguistics. Computer scientists know a lot about them because we design our own languages like Java. This particular style of production is known as BNF (short for Backus-Naur Form). Each production describes the rules for a particular nonterminal symbol. The nonterminal appears first, followed by the symbol "::=", which is usually read as "is composed of". On the right-hand side of the "::=" we have a series of rules separated by the vertical bar character, which we read as "or". The idea is that the nonterminal symbol can be replaced by any of the sequences of symbols appearing between vertical bar characters.
We can describe the basic structure of an English sentence as follows:

    <s> ::= <np> <vp>

We would read this as, "A sentence (<s>) is composed of a noun phrase (<np>) followed by a verb phrase (<vp>)." The symbols <s>, <np> and <vp> are known as "nonterminals" in the grammar. That means that we don't expect them to appear in the actual sentences that we form from the grammar.
I pointed out that you can draw a diagram of how to derive a sentence from the BNF grammar. Wikipedia has an example of this under the entry for parse tree.
Then we "drilled down" a bit into what a noun phrase might look like. I suggested that the simplest form of noun phrase would be a proper noun, which I expressed this way:

    <np> ::= <pn>

So then I asked people for examples of proper nouns and we ended up with this rule:

    <pn> ::= Hillary | Barack | Colbert | Ron Paul | Space Needle | Big Bird | Alf

Notice that the vertical bar character is being used to separate different possibilities. In other words, we're saying that "a proper noun is either Hillary or Barack or Colbert or Ron Paul..." These values on the right-hand side are examples of "terminals". In other words, we expect these to be part of the actual sentences that are formed.
I pointed out that it is important to realize that the input is "tokenized" using white space. For example, the text "Ron Paul" is broken up into two separate tokens. So it's not a single terminal, it's actually two different terminals.
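As a rough illustration of tokenizing on white space, the sketch below uses String.split with a regular expression. The class name TokenizeDemo and the helper tokens are invented for this example, and the actual program may break up its input differently.

```java
import java.util.Arrays;

public class TokenizeDemo {
    // Hypothetical helper: splits a piece of a rule into tokens,
    // treating any run of white space as a separator.
    public static String[] tokens(String text) {
        return text.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] parts = tokens("Ron Paul");
        System.out.println(parts.length);           // prints 2
        System.out.println(Arrays.toString(parts)); // prints [Ron, Paul]
    }
}
```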
At this point I mentioned the fact that we're going to use a slight variation of BNF notation. To keep things simple, we'll use just a simple colon in place of the "::=" in the rules above. So our three rules became:

    <s>: <np> <vp>
    <np>: <pn>
    <pn>: Hillary | Barack | Colbert | Ron Paul | Space Needle | Big Bird | Alf

I saved this file and ran the program. It read the file and began by saying:

    Available symbols to generate are:
    [<np>, <pn>, <s>]
    What do you want generated (return to quit)?

I pointed out that we are defining a nonterminal to be any symbol that appears to the left of a colon in one of our productions. The input file has three productions and that is why the program is showing three nonterminals that can be generated by the grammar. I began by asking for it to generate 5 of the "<pn>" nonterminal symbol and got something like this:

    Hillary
    Alf
    Barack
    Barack
    Big Bird

In this case, it is simply choosing at random among the various choices for a proper noun. Then I asked it for five of the "<s>" nonterminal symbol and got something like this:

    Ron Paul <vp>
    Big Bird <vp>
    Space Needle <vp>
    Ron Paul <vp>
    Barack <vp>

In this case, it is generating 5 random sentences that involve choosing 5 random proper nouns. So far the program isn't doing anything very interesting, but it's good to understand the basics of how it works.
I also pointed out that these are not proper sentences because they contain the nonterminal symbol <vp>. That's because we never finished our grammar. We haven't yet defined what a verb phrase looks like. Notice that the program doesn't care about whether or not something is enclosed in the less-than and greater-than characters, as in "<vp>". That's a convention that is often followed in describing grammar, but that's not how our program is distinguishing between terminals and nonterminals. As mentioned earlier, anything that appears to the left of a colon is considered a nonterminal and every other token is considered a terminal.
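The rule that a nonterminal is whatever appears to the left of the colon can be sketched in a few lines. This is only an illustration under assumed names (ProductionDemo, nonterminal, alternatives); the program's real code is not shown in these notes.

```java
public class ProductionDemo {
    // Hypothetical helper: the nonterminal is the text left of the first colon.
    public static String nonterminal(String line) {
        int colon = line.indexOf(':');
        return line.substring(0, colon).trim();
    }

    // Hypothetical helper: the alternatives are the pieces of the right-hand
    // side that are separated by vertical bar characters.
    public static String[] alternatives(String line) {
        int colon = line.indexOf(':');
        String[] parts = line.substring(colon + 1).split("\\|");
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        String line = "<pn>: Hillary | Barack | Colbert | Ron Paul | Space Needle | Big Bird | Alf";
        System.out.println(nonterminal(line));         // prints <pn>
        System.out.println(alternatives(line).length); // prints 7

        // The angle brackets are only a convention; this works the same way.
        System.out.println(nonterminal("greeting: hello | hi")); // prints greeting
    }
}
```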
Then I said that there are other kinds of noun phrases than just proper nouns. We might use a word like "the" or "a" followed by a noun. I asked what those words are called and someone said they are articles. So we added a new rule to the grammar:

    <article>: a | the

Using this, we changed our rule for <np>:

    <np>: <pn> | <article> <n>

Notice how the vertical bar character is used to indicate that a noun phrase is either a proper noun or it's an article followed by a noun. This required the addition of a new rule for nouns and I again asked for suggestions from the audience:

    <n>: scooter | banana | chipmunk | doritos | brick road | skirt | lightsaber

At this point the overall grammar looked like this:

    <s>: <np> <vp>
    <np>: <pn> | <article> <n>
    <pn>: Hillary | Barack | Colbert | Ron Paul | Space Needle | Big Bird | Alf
    <article>: a | the
    <n>: scooter | banana | chipmunk | doritos | brick road | skirt | lightsaber

We saved the file and ran the program again. Because there are five rules in the grammar, it offered five nonterminals to choose from:

    Available symbols to generate are:
    [<article>, <n>, <np>, <pn>, <s>]
    What do you want generated (return to quit)?

Notice that the nonterminals are in alphabetical order, not in the order in which they appear in the file. That's because they are stored as the keys of a SortedMap that keeps the keys in sorted order.
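The effect of the SortedMap can be seen in a short sketch. It assumes a TreeMap as the SortedMap implementation and stores each right-hand side as a plain String; both choices, along with the names SortedKeysDemo and buildGrammar, are assumptions made for this illustration.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class SortedKeysDemo {
    // Hypothetical helper: stores the five rules in file order.
    public static SortedMap<String, String> buildGrammar() {
        SortedMap<String, String> rules = new TreeMap<String, String>();
        rules.put("<s>", "<np> <vp>");
        rules.put("<np>", "<pn> | <article> <n>");
        rules.put("<pn>", "Hillary | Barack | Colbert | Ron Paul | Space Needle | Big Bird | Alf");
        rules.put("<article>", "a | the");
        rules.put("<n>", "scooter | banana | chipmunk | doritos | brick road | skirt | lightsaber");
        return rules;
    }

    public static void main(String[] args) {
        // The keys come back in sorted order, not insertion order.
        System.out.println(buildGrammar().keySet());
        // prints [<article>, <n>, <np>, <pn>, <s>]
    }
}
```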
We asked the program to generate 5 <np> and we got something like this:

    the banana
    the lightsaber
    Colbert
    the doritos
    Hillary

In this case, it is randomly choosing between the "proper noun" rule and the other rule that involves an article and a noun. It is also then filling in the noun or proper noun to form a string of all terminal symbols. I also asked for five of the nonterminal symbol <s> and got something like this:

    a chipmunk <vp>
    a brick road <vp>
    Barack <vp>
    the brick road <vp>
    the banana <vp>

This is getting better, but we obviously need to include something for verb phrases. We discussed the difference between transitive verbs that take an object (a noun phrase) and intransitive verbs that don't. This led us to add the following new rules:

    <vp>: <tv> <np> | <iv>
    <tv>: hit | stole | ate | kissed | took | squeezed | slapped | spoke | impounded | smoked
    <iv>: laughed | ran | conceded | thrusted | lost | won | fell

We saved the file and ran the program again and each of these three showed up as choices to generate:

    Available symbols to generate are:
    [<article>, <iv>, <n>, <np>, <pn>, <s>, <tv>, <vp>]
    What do you want generated (return to quit)?

Now when we asked for 10 sentences (10 of the nonterminal <s>), we got more interesting results like these:

    a doritos fell
    Colbert smoked a chipmunk
    Alf slapped a lightsaber
    Space Needle laughed
    a skirt stole a chipmunk
    Colbert kissed the lightsaber
    Space Needle laughed
    Alf squeezed Hillary
    Alf spoke a skirt
    Big Bird slapped the chipmunk

Then we decided to spice up the grammar a bit by adding adjectives. We added a new rule for individual adjectives:

    <adj>: spectacular | large | repulsive | tall | gangly | smothered | shiny | tasty

Then we talked about how to modify our rule for noun phrases. We kept our old combination of an article and a noun, but added a new one for an article and a noun with an adjective in the middle:

    <np>: <pn> | <article> <n> | <article> <adj> <n>

But you might want to have more than one adjective. So we introduced a new nonterminal for an adjective phrase:

    <np>: <pn> | <article> <n> | <article> <adjp> <n>

Then we just had to write a production for <adjp>. We want to allow one adjective or two or three, so we could say:

    <adjp>: <adj> | <adj> <adj> | <adj> <adj> <adj>

This is tedious and it doesn't allow four adjectives or five or six. This is a good place to use recursion:

    <adjp>: <adj> | <adj> <adjp>

We are saying that in the simple case or base case, you have one adjective. Otherwise it is an adjective followed by an adjective phrase. This recursive definition is simple, but it allows you to include as many adjectives as you want.
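One way a recursive rule like this could be expanded in code is sketched below. It is not the program used in lecture: the class RecursiveGrammarDemo, the RULES map, and the generate method are all assumptions, and only the two adjective rules are included to keep the example short.

```java
import java.util.Random;
import java.util.SortedMap;
import java.util.TreeMap;

public class RecursiveGrammarDemo {
    // Hypothetical storage: each nonterminal maps to its alternatives.
    private static final SortedMap<String, String[]> RULES =
            new TreeMap<String, String[]>();
    static {
        RULES.put("<adj>", new String[] {"spectacular", "large", "repulsive",
                "tall", "gangly", "smothered", "shiny", "tasty"});
        RULES.put("<adjp>", new String[] {"<adj>", "<adj> <adjp>"});
    }

    // Hypothetical helper: expands a symbol into a string of terminals.
    public static String generate(String symbol, Random rand) {
        if (!RULES.containsKey(symbol)) {
            return symbol; // base case: a terminal stands for itself
        }
        // Recursive case: pick one alternative at random and expand each token.
        String[] choices = RULES.get(symbol);
        String choice = choices[rand.nextInt(choices.length)];
        StringBuilder result = new StringBuilder();
        for (String token : choice.trim().split("\\s+")) {
            if (result.length() > 0) {
                result.append(" ");
            }
            result.append(generate(token, rand));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        Random rand = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(generate("<adjp>", rand));
        }
    }
}
```

Because the second alternative for <adjp> mentions <adjp> again, the method calls itself until the random choice lands on the single-adjective alternative, so the output varies in length from run to run.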
When we ran the program again, we started by asking for 5 adjective phrases and got a result like this:

    gangly tall gangly large shiny gangly spectacular repulsive repulsive smothered tasty

Notice that sometimes we get just one adjective ("large") and sometimes we get several because it chooses randomly between the two different rules we introduced for adjective phrase.
This produced even more interesting sentences, as in the following 10:

    Alf slapped a scooter
    a gangly large lightsaber smoked Hillary
    Hillary laughed
    the gangly shiny chipmunk impounded Space Needle
    a chipmunk kissed the large tall lightsaber
    a banana ate a shiny tasty skirt
    Big Bird took Colbert
    a smothered repulsive tasty brick road smoked Hillary
    Ron Paul smoked Ron Paul
    a banana lost

We made one more change to the grammar to include adverbs and ended up with this final version of the grammar:

    <s>: <np> <vp>
    <np>: <pn> | <article> <n> | <article> <adjp> <n>
    <adj>: spectacular | large | repulsive | tall | gangly | smothered | shiny | tasty
    <adjp>: <adj> | <adj> <adjp>
    <pn>: Hillary | Barack | Colbert | Ron Paul | Space Needle | Big Bird | Alf
    <article>: a | the
    <n>: scooter | banana | chipmunk | doritos | brick road | skirt | lightsaber
    <vp>: <tv> <np> | <iv> | <advp> <tv> <np> | <advp> <iv>
    <tv>: hit | stole | ate | kissed | took | squeezed | slapped | spoke | impounded | smoked
    <iv>: laughed | ran | conceded | thrusted | lost | won | fell
    <advp>: <adv> | <adv> <advp>
    <adv>: quickly | sweetly | recursively | stupidly | awesomely | seductively

Below are 25 sample sentences generated by the grammar:

    a shiny scooter stole Space Needle
    Big Bird ran
    a banana awesomely hit a scooter
    the scooter recursively laughed
    a doritos awesomely laughed
    the shiny chipmunk stole a repulsive doritos
    a repulsive large spectacular banana thrusted
    the skirt seductively thrusted
    the tasty doritos awesomely squeezed Ron Paul
    the shiny large shiny banana hit a shiny tasty shiny repulsive repulsive brick road
    a tall chipmunk kissed Big Bird
    a tall brick road spoke Space Needle
    Big Bird took a chipmunk
    a banana awesomely conceded
    Hillary awesomely stupidly stupidly seductively hit a smothered repulsive banana
    a brick road lost
    a banana recursively recursively won
    a lightsaber seductively ate a scooter
    a lightsaber ran
    the repulsive skirt sweetly lost
    the scooter hit Alf
    the spectacular repulsive lightsaber laughed
    a chipmunk awesomely kissed the brick road
    Barack recursively won
    the tall repulsive chipmunk slapped the chipmunk