CSE143X Notes for Friday, 11/1/13

I mentioned that our next programming assignment will involve String manipulation, which is really a CSE142 topic, but I didn't intend the String processing to be the hard part of the assignment. So I said that I wanted to solve a short problem to remind people how basic String manipulation works.

I said that I wanted to write a method called dashes that would take a String as input and that would return a new String with dashes inserted between letters. So given the input "hello", the method would return "h-e-l-l-o".

Someone said that we could build up a temporary String that the method could return. So the basic structure will look like this:

        public static String dashes(String s) {
            String result = ??
            // fill up string
            return result;
        }
Someone said that we could loop over the characters of the string. Remember that the string has a method called length that tells you the number of characters in the string and a method called charAt that gets you the individual characters of the string:

        public static String dashes(String s) {
            String result = ??
            for (int i = 0; i < s.length(); i++) {
                // do something with s.charAt(i)
            }
            return result;
        }
Someone suggested that each time through the loop we want to add to the current result a dash and the next character, so I modified the loop to be:

        for (int i = 0; i < s.length(); i++) {
            result = result + "-" + s.charAt(i);
        }
The problem with this is that it would add a leading dash that we don't want. And putting the dash after the call on charAt would put an extra dash afterwards. This is a classic fencepost problem that we can solve by processing the first character before the loop.

So we decided to initialize the string to the first character of the word:

        String result = s.charAt(0);
Unfortunately, this isn't going to work. You can't assign a string the value of a character. Someone suggested casting, but that doesn't work either. The usual way to do this in Java is to append the character to an empty string, because you can always concatenate a value to a string:

        String result = "" + s.charAt(0);
But because we processed the first character before the loop, we had to change the loop bounds to start at 1 instead of 0. So our final version became:

        public static String dashes(String s) {
            String result = "" + s.charAt(0);
            for (int i = 1; i < s.length(); i++) {
                result = result + "-" + s.charAt(i);
            }
            return result;
        }
I mentioned one last detail about this method. It fails in one case. Someone said that it doesn't properly handle an empty string, which is correct. We have assumed that there is a character at position 0. That won't work for an empty string. We could add a special case for that, but I said that in this case I would probably add a precondition that makes this clear:

        // pre: s is not an empty string
It's important to consider these cases and document any preconditions, but sometimes assumptions like this are reasonable. For example, in the hangman programming assignment, you will never be asked to deal with an empty string because the program itself guarantees that all words are at least of length 1.
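As a rough sketch of the special-case alternative mentioned above, the method could return an empty string when given one instead of relying on the precondition. The class name DashesSketch and the choice to return "" are assumptions made for illustration; the notes settle on the precondition instead.

```java
class DashesSketch {
    // Hypothetical variant of dashes: handles the empty string with a
    // special case instead of documenting a precondition.
    public static String dashes(String s) {
        if (s.isEmpty()) {
            return "";  // assumed behavior: nothing to separate
        }
        // fencepost: first character goes in before the loop
        String result = "" + s.charAt(0);
        for (int i = 1; i < s.length(); i++) {
            result = result + "-" + s.charAt(i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dashes("hello"));         // h-e-l-l-o
        System.out.println(dashes("a"));             // a
        System.out.println("[" + dashes("") + "]");  // []
    }
}
```

The trade-off is an extra test on every call versus a documented assumption that callers must respect.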

Then I returned to the discussion of the program example that involves keeping track of friends. I said that we would slowly discuss the most important part of the program because it is similar to tasks you will be asked to perform on the programming assignment. We left off in the previous lecture by recognizing that we want to have a map that uses names as keys and that keeps track of a set of names for each person (their set of friends):

        Map<String, Set<String>> friends = new TreeMap<String, Set<String>>();
To fill up this structure, we need to process the input file. Remember that the input file has lines that have two names separated by a "--", as in:

    Ashley -- Christopher
I showed the following code to read lines of input and find the ones that contain names:

        while (input.hasNextLine()) {
            String line = input.nextLine();
            if (line.contains("--")) {
                Scanner lineData = new Scanner(line);
                String name1 = lineData.next();
                lineData.next();  // this skips the "--" token
                String name2 = lineData.next();
                // process name1 and name2
            }
        }
This was not the interesting part of the code because we saw file processing in CSE142. The interesting part is to think of how to process the two names. How do we update our friends map given a new friendship? Friendships are bidirectional, so we have to be careful to add the friendship in both directions. If there is an Ashley--Christopher friendship, then we have to make sure that Ashley's set of friends includes Christopher and we have to make sure that Christopher's set of friends includes Ashley.

I mentioned that this is a good place to introduce an extra method because we're going to do the same thing twice. So we replaced the comment above with the following two lines of code:

        addTo(friends, name1, name2);
        addTo(friends, name2, name1);
So then we turned to the task of writing the addTo method. It takes the map and the two names as parameters, so it looks like this:

        public static void addTo(Map<String, Set<String>> friends, String name1, 
                                 String name2) {
            ...
        }
If we're trying to add name2 to the set for name1, then in general we want to:

        get the set for name1
        add name2 to that set
Here is a first attempt:

        Set<String> names = friends.get(name1);
        names.add(name2);
This is a good start. Remember that the whole point of the map is to associate a name with a set of names. So in the first line of code we ask the map to give us the set of names associated with name1. In the second line, we add name2 to that set.

Although we can write the code in this way as two lines of code, most programmers would write this as one line of code. There is no need to introduce the local variable called names. So we can instead write this as:

        friends.get(name1).add(name2);
But there is a problem with this approach. It assumes that there is a set of names associated with name1. Initially the map is empty. And if we call get for a key that is not in the map, then we get the value null back. That would cause a NullPointerException if we tried to treat it as a set that we can add something to.
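To see the failure concretely, here is a small sketch that tries the one-line version on an empty map. The class MissingKeySketch and the helper addFails are made up for this illustration, and catching NullPointerException is done only to report the failure, not as a recommended style.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class MissingKeySketch {
    // Hypothetical helper: tries the one-line version and reports
    // whether it failed with a NullPointerException.
    public static boolean addFails(Map<String, Set<String>> friends,
                                   String name1, String name2) {
        try {
            friends.get(name1).add(name2);  // get returns null for a missing key
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> friends = new TreeMap<String, Set<String>>();
        System.out.println(friends.get("Ashley"));                       // null
        System.out.println(addFails(friends, "Ashley", "Christopher"));  // true
    }
}
```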

The very first time we see a name, we want to put it into the map. When we do that, we want to associate it with a brand new set that can be used to store the names of that person's friends:

        friends.put(name1, new TreeSet<String>());
But we only want to do this once. For example, if we did this every time we went to add a friendship for this person, then we would always have a set with just one name in it. The first time we see name1, we want to make this set. Then every other time we simply want to add a new name to the existing set. So we need to include a test that constructs the set only the first time we see name1:

        if (!friends.containsKey(name1)) {
            friends.put(name1, new TreeSet<String>());
        }
        friends.get(name1).add(name2);
This is the complete code for the addTo method. It constructs a new set each time it sees a name for the first time. And every time it executes, it adds name2 to the set for name1.
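Putting the pieces together, the following sketch combines the line-reading loop with addTo. To keep it self-contained it reads from a String rather than a file; the class name AddToSketch and the helper buildMap are assumptions for illustration and are not the names used in the posted program.

```java
import java.util.Map;
import java.util.Scanner;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class AddToSketch {
    // Adds name2 to the set for name1, constructing the set the first
    // time name1 is seen.
    public static void addTo(Map<String, Set<String>> friends, String name1,
                             String name2) {
        if (!friends.containsKey(name1)) {
            friends.put(name1, new TreeSet<String>());
        }
        friends.get(name1).add(name2);
    }

    // Hypothetical helper: builds the map from text whose lines look
    // like "name1 -- name2".
    public static Map<String, Set<String>> buildMap(String text) {
        Map<String, Set<String>> friends = new TreeMap<String, Set<String>>();
        Scanner input = new Scanner(text);
        while (input.hasNextLine()) {
            String line = input.nextLine();
            if (line.contains("--")) {
                Scanner lineData = new Scanner(line);
                String name1 = lineData.next();
                lineData.next();  // this skips the "--" token
                String name2 = lineData.next();
                addTo(friends, name1, name2);  // friendship goes both ways
                addTo(friends, name2, name1);
            }
        }
        return friends;
    }

    public static void main(String[] args) {
        String text = "Ashley -- Christopher\nAshley -- Emily\n";
        System.out.println(buildMap(text));
        // {Ashley=[Christopher, Emily], Christopher=[Ashley], Emily=[Ashley]}
    }
}
```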

Then I wrote some code in main to print out the contents of this structure:

        for (String name : friends.keySet()) {
            System.out.println(name + "\t" + friends.get(name));
        }
which produced the following output:

        Andrew	[Christopher, Sarah]
        Ashley	[Christopher, Emily, Jessica, Joshua]
        Bart	[Lisa, Matthew]
        Christopher	[Andrew, Ashley, Jacob, Michael, Sarah]
        Emily	[Ashley, Joshua, Sarah]
        Jacob	[Christopher, Stuart]
        Jessica	[Ashley, Michael]
        JorEl	[KalEl, Zod]
        Joshua	[Ashley, Emily, Michael]
        KalEl	[JorEl]
        Kyle	[Lex, Tyler, Zod]
        Lex	[Kyle]
        Lisa	[Bart, Marge, Matthew]
        Marge	[Lisa]
        Matthew	[Bart, Lisa, Samantha]
        Michael	[Christopher, Jessica, Joshua]
        Samantha	[Matthew, Tyler]
        Sarah	[Andrew, Christopher, Emily]
        Stuart	[Jacob]
        Tyler	[Kyle, Samantha]
        Zod	[JorEl, Kyle]
I said that I would post a version of this program to the calendar called Friends1, but that I would give it a more readable format for the output using a printf command.
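The exact format used in Friends1 is not shown in these notes, so the following is only a sketch of what a printf version might look like. The column width of 12, the class name PrintfSketch, and the helper formatLine are all assumptions.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class PrintfSketch {
    // Assumed format: name left-justified in 12 columns, then the set.
    static final String FORMAT = "%-12s %s";

    // Hypothetical helper that builds one output line as a String.
    public static String formatLine(String name, Set<String> names) {
        return String.format(FORMAT, name, names);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> friends = new TreeMap<String, Set<String>>();
        friends.put("Andrew",
                    new TreeSet<String>(Arrays.asList("Christopher", "Sarah")));
        friends.put("Christopher",
                    new TreeSet<String>(Arrays.asList("Andrew", "Ashley")));
        for (String name : friends.keySet()) {
            // %n ends the line; the sets line up in a single column
            System.out.printf(FORMAT + "%n", name, friends.get(name));
        }
    }
}
```

A fixed-width column avoids the ragged look that tabs produce when a name such as Christopher is longer than one tab stop.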

Then we briefly discussed an example that involves a class for storing an angle in degrees and minutes (used, for example, to talk about latitude and longitude). We weren't able to finish that example, so I will include it in the lecture notes for Monday instead of including it here.

We didn't have time to discuss the extended version of the Friends class, but I include the lecture notes here in case anyone wants to explore how that version is written. The final version of the extended version appears as Friends2.java on the calendar.

First I demonstrated what the Friends program is supposed to do. It is supposed to use this data to find how far one person is from another. So starting with a given person, it finds that person's friends, then the friends of those friends, then the friends of the friends of the friends, and so on. It reports how far it has to go to find a connection and if it runs out of people, it simply reports that the connection couldn't be found.

For example, here is a sample execution using our data file for finding the connection between Stuart and Ashley:

        Welcome to the cse143 friend finder.
        starting name? Stuart
        target name? Ashley
        
        Starting with Stuart
            1 away: [Jacob]
            2 away: [Christopher]
            3 away: [Andrew, Ashley, Michael, Sarah]
        found at a distance of 3
It finds that Stuart has one friend (Jacob). And that friend has one friend we have not already seen (Christopher). And he has 4 friends we have not already seen, including Ashley. So the program reports that Ashley is 3 away.

Here is a sample execution where the connection is not found, asking for a connection between Stuart and Bart:

        Welcome to the cse143 friend finder.
        starting name? Stuart
        target name? Bart 
        
        Starting with Stuart
            1 away: [Jacob]
            2 away: [Christopher]
            3 away: [Andrew, Ashley, Michael, Sarah]
            4 away: [Emily, Jessica, Joshua]
            5 away: []
        not found
The program goes two levels farther than it did before, finding that it runs out of people when it gets 5 away from Stuart. At that point it knows that there is no connection between Stuart and Bart.

We looked at one more example that involved a fairly long chain:

        Welcome to the cse143 friend finder.
        starting name? Bart  
        target name? JorEl
        
        Starting with Bart
            1 away: [Lisa, Matthew]
            2 away: [Marge, Samantha]
            3 away: [Tyler]
            4 away: [Kyle]
            5 away: [Lex, Zod]
            6 away: [JorEl]
        found at a distance of 6
For the Friends1 program, we wrote code that constructs the friends map. The challenge then is to use it to explore friends at various distances. To solve this problem, we will end up using several sets of names. At any given time, we will be exploring a new set of friends that are at the next distance away. We will continue searching until we either find the target name or run out of people to search. So the overall structure of the method is as follows:

        Set<String> newFriends = new TreeSet<String>();
        newFriends.add(name1);
        int distance = 0;
        while (!newFriends.contains(name2) && !newFriends.isEmpty()) {
            distance++;
            // find friends one further away
        }
Inside the loop, we want to use the current set of newFriends to find the next group of newFriends. We can do so simply by adding all of the friends of these friends to a new set and then replacing newFriends with that new set:

        Set<String> newNewFriends = new TreeSet<String>();
        for (String friend : newFriends) {
            newNewFriends.addAll(friends.get(friend));
        }
        newFriends = newNewFriends;
This provides a pretty good solution to the problem. If we throw in some statements to print out what is happening, we end up with this solution:

        Set<String> newFriends = new TreeSet<String>();
        newFriends.add(name1);
        int distance = 0;
        System.out.println();
        System.out.println("Starting with " + name1);
        while (!newFriends.contains(name2) && !newFriends.isEmpty()) {
            distance++;
            Set<String> newNewFriends = new TreeSet<String>();
            for (String friend : newFriends) {
                newNewFriends.addAll(friends.get(friend));
            }
            newFriends = newNewFriends;
            System.out.println("    " + distance + " away: " + newFriends);
        }
        if (newFriends.contains(name2)) {
            System.out.println("found at a distance of " + distance);
        } else {
            System.out.println("not found");
        }
But notice what happens when we run this version of the program:

        Welcome to the cse143 friend finder.
        starting name? Stuart
        target name? Joshua
        
        Starting with Stuart
            1 away: [Jacob]
            2 away: [Christopher, Stuart]
            3 away: [Andrew, Ashley, Jacob, Michael, Sarah]
            4 away: [Andrew, Christopher, Emily, Jessica, Joshua, Sarah, Stuart]
        found at a distance of 4
It is getting the right answer, but the intermediate answers are not correct. It indicates, for example, that Stuart is 2 away from Stuart. That's because it is including the possibility of going from Stuart to Jacob and then from Jacob back to Stuart. In a similar way, it is saying that Christopher is 2 away and Christopher is 4 away. In this case it came up with the right answer, but allowing this kind of duplication makes the program run more slowly and it leads to an infinite loop when there is no connection between people. That's because when you allow duplicates, it just keeps finding more and more friends when it looks 5 away, 6 away, 7 away, and so on.

The solution is to introduce yet another set to keep track of people who have already been explored. Then when we form a new set of friends to consider, we remove the names of people who have already been explored. And we'll have to add the new people to the set of explored people so that we won't explore them in the future. The code below adds three lines to the previous version: the declaration of oldFriends, the call on addAll at the top of the loop, and the call on removeAll after the for-each loop:

        Set<String> oldFriends = new TreeSet<String>();
        Set<String> newFriends = new TreeSet<String>();
        newFriends.add(name1);
        int distance = 0;
        System.out.println();
        System.out.println("Starting with " + name1);
        while (!newFriends.contains(name2) && !newFriends.isEmpty()) {
            distance++;
            oldFriends.addAll(newFriends);
            Set<String> newNewFriends = new TreeSet<String>();
            for (String friend : newFriends) {
                newNewFriends.addAll(friends.get(friend));
            }
            newNewFriends.removeAll(oldFriends);
            newFriends = newNewFriends;
            System.out.println("    " + distance + " away: " + newFriends);
        }
        if (newFriends.contains(name2)) {
            System.out.println("found at a distance of " + distance);
        } else {
            System.out.println("not found");
        }
This completes the program.
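For anyone who wants to experiment with the search on its own, here is a sketch that packages the same loop as a method that returns the distance instead of printing it. The class name DistanceSketch, the method names, the convention of returning -1 when no connection is found, and the extra check for a starting name that is not in the map are all assumptions and are not taken from Friends2.java.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class DistanceSketch {
    // Returns how many friendships separate name1 from name2, or -1
    // (an assumed convention) if there is no connection.
    public static int distanceBetween(Map<String, Set<String>> friends,
                                      String name1, String name2) {
        if (!friends.containsKey(name1)) {
            return -1;  // assumed guard: unknown starting name
        }
        Set<String> oldFriends = new TreeSet<String>();
        Set<String> newFriends = new TreeSet<String>();
        newFriends.add(name1);
        int distance = 0;
        while (!newFriends.contains(name2) && !newFriends.isEmpty()) {
            distance++;
            oldFriends.addAll(newFriends);  // these have now been explored
            Set<String> newNewFriends = new TreeSet<String>();
            for (String friend : newFriends) {
                newNewFriends.addAll(friends.get(friend));
            }
            newNewFriends.removeAll(oldFriends);  // drop anyone already seen
            newFriends = newNewFriends;
        }
        if (newFriends.contains(name2)) {
            return distance;
        } else {
            return -1;
        }
    }

    // Hypothetical helper: records a friendship in both directions.
    public static void link(Map<String, Set<String>> friends, String a, String b) {
        if (!friends.containsKey(a)) {
            friends.put(a, new TreeSet<String>());
        }
        if (!friends.containsKey(b)) {
            friends.put(b, new TreeSet<String>());
        }
        friends.get(a).add(b);
        friends.get(b).add(a);
    }

    // Hypothetical sample data: a small subset of the friendships above.
    public static Map<String, Set<String>> sampleMap() {
        Map<String, Set<String>> friends = new TreeMap<String, Set<String>>();
        link(friends, "Stuart", "Jacob");
        link(friends, "Jacob", "Christopher");
        link(friends, "Christopher", "Ashley");
        link(friends, "Bart", "Lisa");
        return friends;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> friends = sampleMap();
        System.out.println(distanceBetween(friends, "Stuart", "Ashley"));  // 3
        System.out.println(distanceBetween(friends, "Stuart", "Bart"));    // -1
    }
}
```

Because explored names are removed at each step, the set of new friends eventually becomes empty when there is no connection, so the loop terminates.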


Stuart Reges
Last modified: Sat Nov 2 09:13:09 PDT 2013