University of Washington, CSE 142

Lab 6: File Processing

Except where otherwise noted, the contents of this document are Copyright 2013 Stuart Reges and Marty Stepp.

lab document created by Marty Stepp, Stuart Reges and Whitaker Brand

Basic lab instructions

Today's lab

Goals for today:

Recap: String Equality

Recall that we use == to compare the values of primitive types, such as ints, doubles, and chars. However, == has a different meaning for Strings: it evaluates to true only when both variables refer to the very same String object (ex. after String a = b). To see whether two Strings contain the same characters, we instead use .equals().

Enter true or false for the following:

String a = "hello";
String h = "h";
String b = h + "ello";

// a == b?         

false

String a = "hello";
String h = "h";
String b = h + "ello";

// a.equals(b)?

true

String a = "world";
String b = a;

// a == b?         

true

String a = "world";
String b = a;

// a.equals(b)?

true
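
The four cases above can be checked by running them. The following is a minimal sketch; the class name StringEqualityDemo and the variable names c and d are made up for this illustration.

```java
public class StringEqualityDemo {
    public static void main(String[] args) {
        String a = "hello";
        String h = "h";
        String b = h + "ello";   // built while the program runs, so b is a new String object

        System.out.println(a == b);        // false: two different objects
        System.out.println(a.equals(b));   // true: same characters

        String c = "world";
        String d = c;            // d refers to the same object as c

        System.out.println(c == d);        // true: same object
        System.out.println(c.equals(d));   // true: same characters
    }
}
```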

Files

So far, we've used Scanners to read user input. Now, we're going to look at using Scanners to read over files!

A File object allows you to interact with actual files on your computer. Make sure to put any file you want to read with a Scanner in the same folder as the program that wants to use it!

    // necessary to use Files
    import java.io.*;
    // necessary to use Scanners
    import java.util.*;

    // makes a File object that refers to the file named "actualNameOfFile"
    File fileVariableName = new File("actualNameOfFile");

    // a Scanner that reads over the File!
    Scanner fileScanner = new Scanner(fileVariableName);
Method name  Parameters  Description
exists()     none        returns true if the file that the File object refers to actually exists.
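
As a rough sketch of how exists() might be used before opening a Scanner: the class name FileCheck and the helper firstLine are hypothetical, and example.txt is only a placeholder file name.

```java
import java.io.*;     // for File, FileNotFoundException
import java.util.*;   // for Scanner

public class FileCheck {
    // Returns the first line of the named file, or a short message
    // if the file is missing or empty. (Hypothetical helper.)
    public static String firstLine(String fileName) throws FileNotFoundException {
        File f = new File(fileName);
        if (!f.exists()) {                 // exists() is called on the File; it takes no parameters
            return "no such file: " + fileName;
        }
        Scanner input = new Scanner(f);
        String result = "(empty file)";
        if (input.hasNextLine()) {
            result = input.nextLine();
        }
        input.close();
        return result;
    }

    public static void main(String[] args) throws FileNotFoundException {
        System.out.println(firstLine("example.txt"));
    }
}
```

Checking exists() first lets the program print a friendly message instead of crashing when the file is not in the folder.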

Exercise : File Scanner declaration syntax

Which of the following choices is the correct syntax for declaring a Scanner to read the file example.txt in the current directory?

Scanner methods

Method name      Description
next()           reads and returns the next token as a String
nextLine()       reads and returns as a String all the characters up to the next new line (\n)
nextInt()        reads and returns the next token as an int, if possible
nextDouble()     reads and returns the next token as a double, if possible
hasNext()        returns true if there is still a token in the Scanner
hasNextLine()    returns true if there is still at least one line left to be read in the Scanner
hasNextInt()     returns true if the next token can be read as an int
hasNextDouble()  returns true if the next token can be read as a double

Recap: Scanner method examples

Suppose we have a Scanner trying to read over the following:
    4.2 abc 4
The following methods would read this as:
Method        What happened?                     What's going on?
nextInt()     java.util.InputMismatchException   tried to read the next token, 4.2, but couldn't process it as an int.
nextDouble()  returns 4.2 as a double            tried to read the next token; succeeded because it can be read as a double.
next()        returns "4.2" as a String          reads the next token as a String.
nextLine()    returns "4.2 abc 4" as a String    reads the whole next line as a String.
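
The table can be tried out with a Scanner over a String. This is only a sketch: the class name NextDemo is made up, it assumes decimals are written with a period (as on a US-configured computer), and the try/catch is there only so the demo keeps running after nextInt() fails.

```java
import java.util.*;   // for Scanner, InputMismatchException

public class NextDemo {
    public static void main(String[] args) {
        String data = "4.2 abc 4";

        // Each new Scanner starts over at the beginning of the same text.
        System.out.println(new Scanner(data).nextDouble());  // 4.2 (a double)
        System.out.println(new Scanner(data).next());        // 4.2 (a String)
        System.out.println(new Scanner(data).nextLine());    // 4.2 abc 4

        try {
            new Scanner(data).nextInt();   // the token 4.2 is not an int
        } catch (InputMismatchException e) {
            System.out.println("nextInt() failed: 4.2 is not an int");
        }
    }
}
```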

Exercise -A: Tokenizing

How many tokens are in the following String? 3

welcome...to the matrix.

What are the tokens that the String breaks up into?

Exercise -B: More tokenizing

How many tokens are in the following String? 9

in fourteen-hundred 92
columbus sailed the ocean blue :)

What are the tokens that the String breaks up into?
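
One way to check a tokenizing answer is to let a Scanner split the text. The sketch below uses a hypothetical class TokenCount with a hypothetical helper showTokens; it prints each token in brackets and returns how many there were.

```java
import java.util.*;   // for Scanner

public class TokenCount {
    // Prints each whitespace-separated token in brackets and
    // returns the number of tokens. (Hypothetical helper.)
    public static int showTokens(String text) {
        Scanner scan = new Scanner(text);
        int count = 0;
        while (scan.hasNext()) {
            System.out.println("[" + scan.next() + "]");
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int first = showTokens("welcome...to the matrix.");
        System.out.println("tokens: " + first);

        // \n stands for the line break between the two lines of the second String
        int second = showTokens("in fourteen-hundred 92\ncolumbus sailed the ocean blue :)");
        System.out.println("tokens: " + second);
    }
}
```

Tokens are separated only by whitespace (spaces, tabs, new lines), so punctuation such as ... or - stays inside a token.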

Exercise -A: Scanner practice

The next couple problems are about a file called readme.txt that has the following contents:

6.7  This file has
several input
LINES!

   10 20

What would be the output from the following code, as it would appear on the console?

Scanner input = new Scanner(new File("readme.txt"));
System.out.println(input.next());  // 6.7
System.out.println(input.next());  // This
System.out.println(input.next());  // file

Exercise -B: Scanner practice

Input file: readme.txt

6.7  This file has
several input
LINES!

   10 20

What would be the output for the following code? If there would be an error, write error.

Note: these problems are all independent. This input Scanner is not the same as the one on the previous slide!
Scanner input = new Scanner(new File("readme.txt"));
System.out.println(input.nextDouble());  // 6.7
System.out.println(input.nextDouble());  // error

Exercise -C: Scanner practice

Input file: readme.txt

6.7  This file has
several input
LINES!

   10 20

What would be the output for the following code? If there would be an error, write error.

Scanner input = new Scanner(new File("readme.txt"));
while (!input.hasNextInt()) {
    input.next();
}
System.out.println(input.nextInt());  // 10

Exercise -D: Scanner practice

Input file: readme.txt

6.7  This file has
several input
LINES!

   10 20

What would be the output for the following code? If there would be an error, write error.

Scanner input = new Scanner(new File("readme.txt"));
System.out.println(input.nextLine());  // 6.7  This file has
System.out.println(input.nextLine());  // several input
System.out.println(input.nextLine());  // LINES!

New methods: Scanner method examples

Suppose we have a Scanner trying to read over the following:
    4.2 abc 4
The following methods would read this as:
Method           Returned  Why?
hasNextInt()     false     the next token, 4.2, is a double, and thus cannot be read as an int.
hasNextDouble()  true      the next token can be read as a double!
hasNext()        true      there is a next token, and any token can be read as a String!
hasNextLine()    true      there exists a line of input!
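
A small sketch of using the hasNext methods to look at a token before deciding how to read it. The class name HasNextDemo and the helper describeNext are hypothetical, and the example assumes decimals are written with a period.

```java
import java.util.*;   // for Scanner

public class HasNextDemo {
    // Describes the next token WITHOUT consuming it. (Hypothetical helper.)
    public static String describeNext(Scanner scan) {
        if (scan.hasNextInt()) {
            return "int";
        } else if (scan.hasNextDouble()) {
            return "double";
        } else if (scan.hasNext()) {
            return "word";
        } else {
            return "nothing left";
        }
    }

    public static void main(String[] args) {
        Scanner scan = new Scanner("4.2 abc 4");
        while (scan.hasNext()) {
            String kind = describeNext(scan);   // only looks at the token
            String token = scan.next();         // actually consumes the token
            System.out.println(token + " -> " + kind);
        }
    }
}
```

Calling describeNext any number of times never moves the Scanner forward; only the call to next() does.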

Exercise : .hasNext() caution

.hasNext() methods only check whether there is a next token (and what type it can be read as); they never consume it. .next() methods consume tokens, changing what token the Scanner is looking at next! For each of the following, enter the output if the loop terminates, or write infinite if the loop loops forever!

Consider the following as part of input.txt

Jello world :)
Scanner input = new Scanner(new File("input.txt"));
while (input.hasNext()) {
   String nextWord = input.next();
   System.out.print(nextWord + " ");
}
Jello world :) 
Scanner input = new Scanner(new File("input.txt"));
while (input.hasNext()) {
   System.out.println("hi");
}
infinite

Exercise : flipLines practice-it

Write a method named flipLines that accepts a Scanner holding an input file and writes to the console the same file's contents with each pair of lines reversed in order. If the input file has an odd number of lines, the last line should be printed in its original position! For example, if the file contains:

Twas brillig and the slithy toves
did gyre and gimble in the wabe.
All mimsey were the borogroves,
and the mome raths outgrabe.

The End
Your method should produce the following output:
did gyre and gimble in the wabe.
Twas brillig and the slithy toves
and the mome raths outgrabe.
All mimsey were the borogroves,
The End
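
One possible approach, offered as a sketch rather than the official solution: read a line, and if another line follows, print that second line first. The class name FlipLines and the sample text in main are made up for this illustration.

```java
import java.util.*;   // for Scanner

public class FlipLines {
    public static void flipLines(Scanner input) {
        while (input.hasNextLine()) {
            String first = input.nextLine();
            if (input.hasNextLine()) {
                // there is a partner line, so print it before the first one
                String second = input.nextLine();
                System.out.println(second);
            }
            System.out.println(first);   // an unpaired last line stays in place
        }
    }

    public static void main(String[] args) {
        // a Scanner over a String stands in for the input file here
        Scanner input = new Scanner("line 1\nline 2\nline 3\n");
        flipLines(input);
    }
}
```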

File Exceptions

When you work with Files, the Java compiler gets concerned that you might be trying to access a File that doesn't exist. So any method that uses Files must declare that it might throw a FileNotFoundException: this basically tells Java that if the desired "actualNameOfFile" cannot be found, it's okay to crash.

Every method that directly or indirectly calls a method that works with Files also needs to declare that it might throw a FileNotFoundException.

  public static void main(String[] args) throws FileNotFoundException {
                           ...
  }
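
To illustrate the "directly or indirectly" rule, here is a minimal sketch with a hypothetical helper countLines. The helper opens the file, so it needs the throws clause; main only calls the helper, but it needs the clause too. The file name readme.txt is borrowed from the earlier exercises.

```java
import java.io.*;     // for File, FileNotFoundException
import java.util.*;   // for Scanner

public class ThrowsDemo {
    // main never touches a File itself, but it calls countLines,
    // so it must also declare the exception.
    public static void main(String[] args) throws FileNotFoundException {
        System.out.println(countLines("readme.txt"));
    }

    // Opens the file directly, so it must declare the exception.
    public static int countLines(String fileName) throws FileNotFoundException {
        Scanner input = new Scanner(new File(fileName));
        int count = 0;
        while (input.hasNextLine()) {
            input.nextLine();
            count++;
        }
        return count;
    }
}
```

Removing either throws clause makes the compiler reject the program.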

Exercise : Words

Exercise - answer

import java.io.*;     // for File
import java.util.*;   // for Scanner

public class Words {
    public static void main(String[] args) throws FileNotFoundException {
        int wordCount = 0;
        Scanner input = new Scanner(new File("wordinput.txt"));
        
        // your code goes here ...
        while (input.hasNext()) {
            String word = input.next();
            wordCount++;
        }
        
        System.out.println("Total words = " + wordCount);
    }
}

.next() vs .nextLine()

String Scanners

Consider the following problem:

Read over an input File, print out the # of times the word "the" shows up on each line.

We already know how to read full lines or individual words from a Scanner. Now we need a new strategy: we need to read lines from the input, and then we need to somehow read word-by-word through each line.

We can do this by creating a second Scanner! So far, we've used Scanners to read user input and to read Files. Scanners can also read over Strings!

    Scanner fileScannerName = new Scanner(new File("fileName"));
    while (fileScannerName.hasNextLine()) {
       String line = fileScannerName.nextLine(); // reads the next line from the input file
       Scanner lineScanner = new Scanner(line);
       while (lineScanner.hasNext()) {
          String word = lineScanner.next(); // reads the next word from the input line
          ...
       }
    }
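
Filling in the ... for the "the" problem might look like the sketch below. The class name CountThe is made up, a Scanner over a String stands in for the file, and the sketch assumes that only the exact lowercase token "the" should count (so "The" and "then" do not).

```java
import java.util.*;   // for Scanner

public class CountThe {
    // Prints, for each line of input, how many tokens are exactly "the".
    public static void countThe(Scanner input) {
        while (input.hasNextLine()) {
            String line = input.nextLine();
            Scanner lineScanner = new Scanner(line);   // second Scanner, over just this line
            int count = 0;
            while (lineScanner.hasNext()) {
                String word = lineScanner.next();
                if (word.equals("the")) {              // .equals, not ==, for Strings
                    count++;
                }
            }
            System.out.println(count);
        }
    }

    public static void main(String[] args) {
        countThe(new Scanner("the cat saw the dog\nno match here\nThe end of the line"));
    }
}
```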

Exercise : coinFlip practice-it

Write a method named coinFlip that accepts a Scanner for an input file of coin flips that are heads (H) or tails (T). Consider each line to be a separate set of coin flips and output the number and percentage of heads in that line. If it is more than 50%, print "You win!". Consider the following file:

H T H H T
T t    t  T h  H
      h

For the input above, your method should produce the following output:

3 heads (60.0%)
You win!

2 heads (33.3%)

1 heads (100.0%)
You win!
Note: to print a "%" using printf, say System.out.printf("%%");
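
One possible sketch of coinFlip, using a second Scanner for each line. The class name CoinFlip is made up, and the sketch assumes that every line contains at least one flip and that a blank line is printed after each line's report, as in the sample output.

```java
import java.util.*;   // for Scanner

public class CoinFlip {
    public static void coinFlip(Scanner input) {
        while (input.hasNextLine()) {
            Scanner line = new Scanner(input.nextLine());
            int heads = 0;
            int total = 0;
            while (line.hasNext()) {
                String flip = line.next();
                if (flip.equalsIgnoreCase("H")) {   // counts both "H" and "h"
                    heads++;
                }
                total++;
            }
            double percent = 100.0 * heads / total;
            System.out.printf("%d heads (%.1f%%)\n", heads, percent);
            if (percent > 50) {
                System.out.println("You win!");
            }
            System.out.println();   // blank line between sets of flips
        }
    }

    public static void main(String[] args) {
        // a Scanner over a String stands in for the input file here
        coinFlip(new Scanner("H T H H T\nT t    t  T h  H\n      h\n"));
    }
}
```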

Exercise : countWords errors practice-it

// Counts the total lines and words in the given input scanner.
public static void countWords(Scanner input) {
    Scanner input = new Scanner(new File("example.txt"));
    int lineCount = 0;
    int wordCount = 0;
    
    while (input.nextLine()) {
        String line = input.line();       // read one line
        lineCount++;
        while (line.next()) {             // count tokens in line
            String word = lineScan.hasNext;
            wordCount++;
        }
    }
}

The above attempted solution to Practice-It problem "countWords" has a few errors. Open Practice-It, copy/paste this code into it, and fix the errors. Complete the code so that it passes the test cases.

Exercise - solution

// Counts the total lines and words in the given input scanner.
public static void countWords(Scanner input) {
    int lineCount = 0;
    int wordCount = 0;
    
    while (input.hasNextLine()) {
        String line = input.nextLine();   // read one line
        lineCount++;
        Scanner lineScan = new Scanner(line);
        while (lineScan.hasNext()) {      // count tokens in line
            String word = lineScan.next();
            wordCount++;
        }
    }
    
    System.out.println("Total lines = " + lineCount);
    System.out.println("Total words = " + wordCount);
    System.out.printf("Average words per line = %.3f\n", (double) wordCount / lineCount);
}

Exercise : Debug ZipCode Case Study

In this exercise we will practice the jGRASP debugger using the Case Study example from the end of Chapter 6. To download this example, follow these steps:

  1. Go to the chapter 6 supplements for the textbook.
  2. Download and save the files ZipLookup.java and zipcode.txt. Right-click the file names and choose the option to save the link in whatever folder you have been using for lab work. Make sure to save them in the same folder.
  3. Compile and run ZipLookup.java in jGRASP. You might try using your own ZIP code and a relatively small radius like 0.5 miles. The program takes a while to run because it has to search a large data file.

Exercise - jGRASP Debugger

We'll debug the program as it searches for all zip codes within 0.3 miles of the White House, at zip code 20500. Run the program with those input values:
What zip code are you interested in? 20500
And what proximity (in miles)? 0.3

20500: Washington, DC
zip codes within 0.3 miles:
    20045 Washington, DC, 0.26 miles
    20500 Washington, DC, 0.00 miles
    20501 Washington, DC, 0.27 miles
    20502 Washington, DC, 0.27 miles
Set a break point on the while loop itself. Then enter the values of lat1 and long1 (latitude and longitude of the White House ZIP code).
lat1
38.894781
long1
-77.036122

Exercise - jGRASP Debugger

Clear your previous break point and set a new break point on the printf inside the if. Then hit the resume button that looks like a play button and fill in the table below with the values for zip, lat2, and long2.

zip lat2 long2
20045 38.896599 -77.0319
20500 38.894781 -77.036122
20501 38.89872 -77.036198
20502 38.89872 -77.036198

Exercise : runningSum practice-it

Write a static method called runningSum that accepts as a parameter a Scanner holding a sequence of real numbers and that outputs the running sum of the numbers followed by the maximum running sum. For example if the Scanner contains the following data:

3.25 4.5 -8.25 7.25 3.5 4.25 -6.5 5.25

Your method should produce the following output:

running sum = 3.25 7.75 -0.5 6.75 10.25 14.5 8.0 13.25
max sum = 14.5

Click on the check-mark above to try out your solution in Practice-it!
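
A possible sketch of runningSum, not necessarily the intended solution. The class name RunningSum is made up, and the sketch assumes the Scanner holds at least one number and that decimals are written with a period.

```java
import java.util.*;   // for Scanner

public class RunningSum {
    public static void runningSum(Scanner input) {
        // Read the first number before the loop so that max starts
        // at a real running sum (assumes at least one number).
        double sum = input.nextDouble();
        double max = sum;
        System.out.print("running sum = " + sum);

        while (input.hasNextDouble()) {
            sum += input.nextDouble();
            max = Math.max(max, sum);
            System.out.print(" " + sum);
        }
        System.out.println();
        System.out.println("max sum = " + max);
    }

    public static void main(String[] args) {
        runningSum(new Scanner("3.25 4.5 -8.25 7.25 3.5 4.25 -6.5 5.25"));
    }
}
```

Starting max at the first sum rather than at 0 keeps the answer right even when every running sum is negative.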

Exercise : printDuplicates practice-it

Write a method printDuplicates that accepts a Scanner for an input file. Examine each line for consecutive occurrences of the same token on the same line and print each duplicated token along with how many times it appears consecutively. For example the file:

hello how how are you you you you
I I I am Jack's Jack's smirking smirking smirking smirking smirking revenge
one fish two fish red fish blue fish
   bow  wow wow yippee yippee   yo yippee   yippee yay  yay yay

leads to the following console output:

how*2 you*4
I*3 Jack's*2 smirking*5

wow*2 yippee*2 yippee*2 yay*3
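
A sketch of one way to approach printDuplicates: remember the previous token and count how long the current run is. The class name PrintDuplicates is made up, and the sketch assumes that a single space after each printed entry (including the last one on a line) is acceptable.

```java
import java.util.*;   // for Scanner

public class PrintDuplicates {
    public static void printDuplicates(Scanner input) {
        while (input.hasNextLine()) {
            Scanner line = new Scanner(input.nextLine());
            if (line.hasNext()) {
                String prev = line.next();   // token that started the current run
                int count = 1;               // length of the current run
                while (line.hasNext()) {
                    String word = line.next();
                    if (word.equals(prev)) {
                        count++;
                    } else {
                        if (count > 1) {
                            System.out.print(prev + "*" + count + " ");
                        }
                        prev = word;
                        count = 1;
                    }
                }
                if (count > 1) {             // a run that reaches the end of the line
                    System.out.print(prev + "*" + count + " ");
                }
            }
            System.out.println();            // one output line per input line
        }
    }

    public static void main(String[] args) {
        printDuplicates(new Scanner("hello how how are you you you you\none fish two fish"));
    }
}
```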

Exercise : mostCommonNames practice-it

Write a method mostCommonNames that accepts a Scanner for an input file with names on each line separated by spaces. Some names appear multiple times; if they do, they are listed consecutively. For example:

Benson Eric   Eric  Marty Kim  Kim Kim   Jenny  Nancy Nancy  Nancy  Paul  Paul
Stuart Stuart Stuart Ethan Alyssa Alyssa Helene Jessica Jessica Jessica Jessica
Jared  Alisa Yuki   Catriona  Cody   Coral   Trent Kevin  Ben Stefanie Kenneth

For each line, print the most commonly occurring name. If there's a tie, use the first name that had that many occurrences.

Most common: Kim
Most common: Jessica
Most common: Jared

Also return the total number of unique names in the whole file (e.g. 23 for the above input).

Exercise : frequentFlier practice-it

Write a method frequentFlier that accepts a Scanner for an input file of ticket type / mileage pairs and reports how many frequent-flier miles the person earned.

For example, given the input below, your method should return 15600 (2*5000 + 1500 + 100 + 2*2000).

firstclass  5000 coach   1500  coach
100 firstclass 2000  discount 300
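
A sketch of frequentFlier under stated assumptions. The multipliers are inferred only from the arithmetic in the example above (firstclass miles count double, coach miles count once, discount miles count for nothing), and the class name FrequentFlier is made up. Because a pair can be split across two lines, the sketch reads token by token instead of line by line.

```java
import java.util.*;   // for Scanner

public class FrequentFlier {
    public static int frequentFlier(Scanner input) {
        int total = 0;
        while (input.hasNext()) {
            String type = input.next();    // ticket type
            int miles = input.nextInt();   // mileage that goes with it
            if (type.equals("firstclass")) {
                total += 2 * miles;        // assumed: double miles
            } else if (type.equals("coach")) {
                total += miles;            // assumed: regular miles
            }
            // assumed: any other type, such as "discount", earns nothing
        }
        return total;
    }

    public static void main(String[] args) {
        Scanner input = new Scanner("firstclass  5000 coach   1500  coach\n100 firstclass 2000  discount 300");
        System.out.println(frequentFlier(input));   // 15600
    }
}
```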

If you finish them all...

If you finish all the exercises, try out our Practice-It web tool. It lets you solve Java problems from our Building Java Programs textbook.

You can view an exercise, type a solution, and submit it to see if you have solved it correctly.

Choose some problems from the book and try to solve them!