handout #20

CSE142—Computer Programming I

Programming Assignment #6

due: Tuesday, 11/15/05, 2 pm

Thanks to Nick Parlante of Stanford University for designing the original version of this “nifty” assignment.

This assignment will give you practice with file processing in conjunction with all of the other constructs we have learned this quarter (loops, if/else, methods, and graphics).  We will be processing a file with data obtained from the Social Security Administration.  They provide a web site showing the distribution of names chosen for children over the last 100 years in the US (http://www.ssa.gov/OACT/babynames/).

Every 10 years, the data gives the 1000 most popular boy and girl names for kids born in the US. The data can be boiled down to a single text file as shown below. On each line we have the name, followed by the rank of that name in 1900, 1910, 1920, ..., 2000 (11 numbers). A rank of 1 was the most popular name that year, while a rank of 997 was not very popular. A 0 means the name did not appear in the top 1000 that year at all.  The lines are in alphabetical order, although we will not depend on that.


Sam 58 69 99 131 168 236 278 380 467 408 466
Samantha 0 0 0 0 0 0 272 107 26 5 7
Samara 0 0 0 0 0 0 0 0 0 0 886
Samir 0 0 0 0 0 0 0 0 920 0 798
Sammie 537 545 351 325 333 396 565 772 930 0 0
Sammy 0 887 544 299 202 262 321 395 575 639 755
Samson 0 0 0 0 0 0 0 0 0 0 915
Samuel 31 41 46 60 61 71 83 61 52 35 28
Sandi 0 0 0 0 704 864 621 695 0 0 0
Sandra 0 942 606 50 6 12 11 39 94 168 257


We see that “Sam” was #58 in 1900 and is slowly moving down. “Samantha” popped onto the scene in 1960 and is moving up strongly to #7. “Samir” barely appears in 1980, but by 2000 is up to #798. The database is for children born in the US, so ethnic trends show up when immigrants have kids.
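
To make the line format concrete, here is a minimal sketch that pulls the rank for a given year out of one line of data. The class name LineFormatSketch and the method rankInYear are hypothetical names chosen for illustration, and the sketch assumes the 1900 start year and one rank per decade described above. It is not part of the required solution.

```java
import java.util.Scanner;

// Hypothetical helper for illustration only; not required by the assignment.
class LineFormatSketch {
    static final int START_YEAR = 1900;  // first decade on each line (assumed)

    // Returns the rank stored on the given line for the given decade year.
    static int rankInYear(String line, int year) {
        Scanner tokens = new Scanner(line);
        tokens.next();                          // skip the name
        int index = (year - START_YEAR) / 10;   // 0 for 1900, 1 for 1910, ...
        int rank = 0;
        for (int i = 0; i <= index; i++) {
            rank = tokens.nextInt();            // read up to the wanted decade
        }
        return rank;
    }

    public static void main(String[] args) {
        String line = "Samantha 0 0 0 0 0 0 272 107 26 5 7";
        System.out.println(rankInYear(line, 1960));  // prints 272
        System.out.println(rankInYear(line, 2000));  // prints 7
    }
}
```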

Your program is to give an introduction and then prompt the user for a name to display.  Then it will read through the data file searching for that name.  If it finds it, it should graph the data for that name.  If not, it should generate a short message indicating that the name is not found.  Look at the sample log of execution at the end of this write-up.  You are to exactly reproduce this format.

If the name is found, you are to construct a DrawingPanel to graph the data.  Don’t construct this object until you verify that the name is in the file.  If the name is not found, you shouldn’t construct a DrawingPanel at all.  For the cases when the name is found, you should construct the panel and then give a series of drawing commands that produce output like the following.  You are to exactly reproduce this output.

The background color is the standard white (in other words, you don’t have to set the background color).  There are a series of horizontal and vertical lines drawn in black.  The overall height of the panel should always be 550 pixels.  The horizontal line at the top should be 25 pixels below the top and the horizontal line at the bottom should be 25 pixels above the bottom.  The panel is divided into 11 sections of equal width to represent the 11 different decades for which we have data (starting with 1900 through 2000).  These particular sections have a horizontal width of 70 pixels each.  You should introduce class constants for the number of decades (11), the starting year (1900) and the horizontal width per decade (70).  It should be possible to change these values and have your program adapt appropriately (more on this later).  Notice that you must draw vertical lines to separate the 11 sections.
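
As a rough illustration of how the three required constants can drive the layout, the sketch below computes the overall panel width and the positions of the grid lines. The class name, the method names, and the extra constants PANEL_HEIGHT and MARGIN are assumptions made for this example. It does the arithmetic only and leaves the DrawingPanel calls to you.

```java
// Hypothetical sketch: only the three required constants come from the handout;
// every other name here is an assumption made for illustration.
class GridLayoutSketch {
    static final int DECADES = 11;        // number of decades of data
    static final int START_YEAR = 1900;   // first decade
    static final int DECADE_WIDTH = 70;   // horizontal pixels per decade
    static final int PANEL_HEIGHT = 550;  // fixed height from the handout
    static final int MARGIN = 25;         // gap above and below the plot area

    // Total width: one section of DECADE_WIDTH pixels per decade.
    static int panelWidth() {
        return DECADES * DECADE_WIDTH;
    }

    // x-coordinate of the left edge of section i (0-based).
    static int sectionLeftX(int i) {
        return i * DECADE_WIDTH;
    }

    // y-coordinates of the two horizontal lines.
    static int topLineY() {
        return MARGIN;
    }

    static int bottomLineY() {
        return PANEL_HEIGHT - MARGIN;
    }

    public static void main(String[] args) {
        System.out.println("panel is " + panelWidth() + " x " + PANEL_HEIGHT);
        for (int i = 1; i < DECADES; i++) {
            System.out.println("vertical line at x = " + sectionLeftX(i));
        }
    }
}
```

Because every coordinate is derived from the constants, changing DECADES or DECADE_WIDTH moves all of the lines together.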

Notice that at the bottom of the panel you need to label each decade.  To do so, you will need a new drawing command.  The Graphics class includes a drawString method that takes 3 parameters: a String, an x-coordinate and a y-coordinate.  The (x, y) coordinate is the position of the lower-left corner of the text.  For example, the text “1900” above has coordinates (0, 550) while the text “1910” has the coordinates (70, 550).  You will obviously want to write a loop to draw these various labels, especially since your program has to adapt properly if the constants are changed.  At some point you will be faced with the problem of turning an int into a String (e.g., how do you turn an int value like 1900 into the String “1900”?).  There are two ways to do this.  You can call the method String.valueOf passing it the int and it will return a corresponding String or you can concatenate the int with an empty string as in:

"" + 1900

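Both conversions can be seen side by side in the small sketch below, which also computes where each decade label would go. The class and method names are hypothetical, and the constants repeat the values given above. In your program, each label and its x-coordinate would be passed to drawString.

```java
// Hypothetical sketch of the label loop; names are assumptions for illustration.
class DecadeLabelSketch {
    static final int DECADES = 11;
    static final int START_YEAR = 1900;
    static final int DECADE_WIDTH = 70;

    // Conversion using String.valueOf.
    static String labelFor(int i) {
        return String.valueOf(START_YEAR + 10 * i);
    }

    // Same conversion using concatenation with an empty string.
    static String labelByConcat(int i) {
        return "" + (START_YEAR + 10 * i);
    }

    // x-coordinate of the lower-left corner of label i.
    static int labelX(int i) {
        return i * DECADE_WIDTH;
    }

    public static void main(String[] args) {
        for (int i = 0; i < DECADES; i++) {
            System.out.println(labelFor(i) + " at x = " + labelX(i));
        }
    }
}
```

Note the parentheses in the concatenation version: without them, "" + START_YEAR + 10 * i would append the two numbers as text instead of adding them first.
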
Then you’ll need to plot the actual data for the individual name.  As noted earlier, the panel will always be 550 pixels high with the upper and lower 25 pixels not part of the plot area.  That leaves exactly 500 pixels for plotting these values.  The numbers range from 1 to 1000, so each pixel will represent two different rankings.  Thus, a rank of 1 or 2 should be drawn at a y-coordinate of 25.  A rank of 3 or 4 should be drawn at a y-coordinate of 26.  A rank of 5 or 6 should be drawn at a y-coordinate of 27.  And so on up to ranks of 999 and 1000 which should be drawn at a y-coordinate of 524.  A rank of 0 (which means the name didn’t appear in the top 1000 at all) should be drawn at the very bottom of the plot range, at a y-coordinate of 525.
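
One way to express this mapping is with integer division, as in the sketch below. The class name, method name, and the constants TOP and BOTTOM are assumptions for illustration. You are free to structure the calculation differently as long as it produces the same y-coordinates.

```java
// Hypothetical sketch of the rank-to-pixel mapping described above.
class RankPlotSketch {
    static final int TOP = 25;      // y-coordinate for ranks 1 and 2
    static final int BOTTOM = 525;  // y-coordinate for rank 0 (not in top 1000)

    static int rankToY(int rank) {
        if (rank == 0) {
            return BOTTOM;
        }
        // Integer division pairs up ranks: 1,2 -> 0; 3,4 -> 1; ... 999,1000 -> 499.
        return TOP + (rank - 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(rankToY(1));     // prints 25
        System.out.println(rankToY(58));    // prints 53
        System.out.println(rankToY(1000));  // prints 524
        System.out.println(rankToY(0));     // prints 525
    }
}
```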

You are to draw lines connecting the different decades.  In addition, just to the right of the line you are to include a String that has the name followed by a space followed by the rank.  Notice, for example, that the String “Sam 58” appears to the right of the line for 1900.  That is because Sam had rank 58 in 1900.  The text is to appear just to the right and just above the point you are plotting.  In other words, you can use the same coordinates for drawString that you use for drawLine.  The lines and text for the actual plot should be drawn in red.
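
The sketch below shows one possible shape for drawing a single segment and its label. It assumes you already have a Graphics object (for example, from your DrawingPanel) and that the y-coordinates have already been computed from the ranks. The class name, method names, and parameter list are hypothetical.

```java
import java.awt.Color;
import java.awt.Graphics;

// Hypothetical sketch; the method names and parameters are assumptions.
class PlotSketch {
    static final int DECADE_WIDTH = 70;

    // Text shown next to each plotted point: the name, a space, then the rank.
    static String label(String name, int rank) {
        return name + " " + rank;
    }

    // Draws the segment from decade i to decade i + 1 and labels its left end.
    // y1 and y2 are the already-computed y-coordinates for the two decades.
    static void drawSegment(Graphics g, String name, int i,
                            int rank1, int y1, int y2) {
        g.setColor(Color.RED);
        int x1 = i * DECADE_WIDTH;
        int x2 = (i + 1) * DECADE_WIDTH;
        g.drawLine(x1, y1, x2, y2);
        g.drawString(label(name, rank1), x1, y1);  // same point as the line start
    }
}
```

With this arrangement the final decade has no segment starting at it, so its label would need to be drawn separately.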

You should ignore case when comparing the name typed by the user with the names in the input file.  The Strings that you display on the drawing panel should use the name from the input file, even if the user types the name in a different case.  For example, if the user asks you to search for “SAM”, you should find it even though the input file has it as “Sam” and the drawing panel should use the input file’s “Sam” rather than what the user typed when it displays the name/rank information in the plot.
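
A minimal sketch of the comparison, using the standard String method equalsIgnoreCase, is shown below. The class and method names are hypothetical.

```java
// Hypothetical sketch of a case-insensitive name comparison.
class NameMatchSketch {
    // True when the name from the file matches what the user typed, ignoring case.
    static boolean sameName(String fromFile, String typed) {
        return fromFile.equalsIgnoreCase(typed);
    }

    public static void main(String[] args) {
        String fromFile = "Sam";
        String typed = "SAM";
        if (sameName(fromFile, typed)) {
            // Display the file's spelling, not the user's.
            System.out.println(fromFile + " 58");  // prints "Sam 58"
        }
    }
}
```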

You will be using several different Scanner objects for this program.  You will have one Scanner that you use to read information from the console.  You will use a different Scanner to read from the input file.  And because the input file is line-based, you should construct a different Scanner object for each line of the input file, as in handout #19 and as explained in section 6.3 of the book.  You should write your code in such a way that you stop consuming lines of input once you find one that has the name you’re searching for.
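
The following sketch illustrates the line-based pattern: an outer Scanner reads one line at a time, and a second Scanner is constructed for each line to pull off the name. To keep the example self-contained, it reads from a String rather than from names.txt. The class name, the method findLine, and the choice to return null when nothing matches are all assumptions made for illustration.

```java
import java.util.Scanner;

// Hypothetical sketch of a line-based search that stops at the first match.
class NameSearchSketch {
    // Returns the matching line as it appears in the input,
    // or null if no line begins with the target name.
    static String findLine(Scanner input, String target) {
        while (input.hasNextLine()) {
            String line = input.nextLine();
            Scanner tokens = new Scanner(line);  // one Scanner per line
            if (tokens.hasNext() && tokens.next().equalsIgnoreCase(target)) {
                return line;                     // stop consuming lines here
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String sample = "Sam 58 69 99 131 168 236 278 380 467 408 466\n"
                      + "Samantha 0 0 0 0 0 0 272 107 26 5 7\n"
                      + "Samara 0 0 0 0 0 0 0 0 0 0 886\n";
        Scanner input = new Scanner(sample);
        System.out.println(findLine(input, "SAMANTHA"));
        System.out.println(input.hasNextLine());  // prints true: Samara was not read
    }
}
```

Returning as soon as the name is found is what leaves the remaining lines unread.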

In terms of style points, we will be grading on your appropriate use of control structures like loops and if/else statements, your ability to avoid redundancy and your ability to break down the overall problem into methods that each solve part of the overall problem.  No one method should be overly long.  You should be able to come up with at least four different methods other than main that each perform some nontrivial part of the problem.

You will need to have DrawingPanel.java in the same directory as your program for it to compile properly.  In addition, you will need to have the file names.txt in the same directory as your program for Java to find it.  If you are using DrJava, you will have to put the names.txt file in the same folder as the DrJava program or you will have to use a fully-qualified file name as described in section 6.2.2 of the book.

Nick suggests the following questions to explore and recommends the original article that gave him the idea (http://www.farfilm.com/peggy/articles/wherehaveallthelisas.htm):

· Why is Rock popular in 1950 and Trinity in 2000?

· Type in your grandparents' names. Names like Ethel and Mildred and Clarence sound old fashioned – and they are! But wait long enough and they come back – Emma! Hannah!

· Michael is very popular. To see the growing Spanish-speaking influence in the US, look at Miguel. For a more recent immigration, look at Muhammad and Samir.

· Apparently biblical Old Testament names came back in the 1970's. A reaction to the 1960's maybe? Try Rachel and Rebecca. The pattern seems to generalize – Sarah, Abraham, Adam.  Eve and Moses are out of luck for some reason though.

· Try historical names like Sigmund or Adolf. I was thinking Adolf would vanish in the mid 30's but it seems to vanish 10 years before that. In any case, Adolf is tricky, since there are a bunch of variant names like Adolph.

As noted earlier, your program has to have three constants for the number of decades, the start year and the horizontal width to use for each decade.  You can make sure your program works properly by changing the number of decades to 9, the start year to 1920 and the horizontal width to 90 and changing the file name to names2.txt (which has just 9 decades' worth of data).  Below is a sample of the output for “Ethel”.

You can introduce other class constants if you want to in addition to the three required constants.
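
As a quick sanity check on the two configurations mentioned above, the sketch below computes the panel width and the final decade label for each. The class and method names are hypothetical, and the values are passed as parameters only to compare both settings in one run. Your program should use class constants.

```java
// Hypothetical sketch comparing the two constant settings described above.
class AdaptSketch {
    static int panelWidth(int decades, int decadeWidth) {
        return decades * decadeWidth;
    }

    static int lastYear(int startYear, int decades) {
        return startYear + 10 * (decades - 1);
    }

    public static void main(String[] args) {
        // names.txt settings: prints "770 2000"
        System.out.println(panelWidth(11, 70) + " " + lastYear(1900, 11));
        // names2.txt settings: prints "810 2000"
        System.out.println(panelWidth(9, 90) + " " + lastYear(1920, 9));
    }
}
```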

Your program should be stored in a file called Names.java.  All of the files that you will need are stored in a file called ass6.zip on the class web page.  These files should be put in the same folder as your program (DrJava users, see the note above about names.txt and names2.txt).

Log of execution (user input underlined)

This program allows you to search through the
data from the Social Security Administration
to see how popular a particular name has been
since 1920.

name? kumar
name not found