handout #11
CSE142—Computer
Programming I
Programming
Assignment #6
due: Wednesday, 8/2/17,
11 pm
Thanks to Nick Parlante from Stanford for the
assignment concept.
This assignment focuses on reading input files. Turn in a file named BabyNames.java.
You will need DrawingPanel.java, which you used on previous
assignments, to write this program.
The Social
Security Administration has published the 1000 most popular boy and girl names
for children born in the US for all years after 1879 (see http://www.ssa.gov/OACT/babynames/). For this project, you will prompt the user
for a name and gender, and then display the name’s meaning and popularity as console
text and as a graphical bar chart on a DrawingPanel. The input data about names' popularity
rankings and meanings comes from two input files provided on the course web
site.
Your program
should give an introduction and then prompt the user for a first name and
gender. It should then read the name
rank data file searching for that name/gender combination, case-insensitively
(that is, you should find the name regardless of the capitalization the user
uses when typing it). If the name/gender
combination is found in the file, your program should print a line of statistics
about that combination’s popularity in each decade, the name’s meaning and
display information about the name graphically.
This program allows you to search through the
data from the Social Security Administration
to see how popular a particular name has been
since 1890.
Name: michelle
Gender (M or F): f
Michelle F 0 0 0 0 0 728 173 39 4 10 22 52 125
MICHELLE f French, English French feminine form of MICHAEL
Your program reads data from two files. Download them from our web site to the same
folder as your program.
1. names.txt: popularity
rankings for each name 1890-2010
Each
line of names.txt contains a name followed by
that name's rank in 1890, 1900, 1910, etc.
The default file has 13 numbers/line, so the last represents the ranking
in 2010. Rank #1 was the most popular that year, while rank #999 was not
popular. Rank 0 means the name did not appear in the top 1000. For example:
Michelle F 0 0 0 0 0 728 173 39 4 10 22 52 125
Michelle M 0 0 0 0 0 0 0 0 736 897 0 0 0
Michial M 0 0 0 0 0 0 987 0 0 0 0 0 0
"Michelle"
as a female name first made the list in 1940 and peaked in 1970 at #4. It has been on a steady decline in popularity
since. "Michial"
only made the top-1000 in 1950. Notice
that the spacing between numbers and other tokens varies. This won’t be a problem if you use a Scanner
to pull apart each line of input (versus, for example, searching for a string
that has exactly one space between the name and the F/M gender notation).
Once
the user types a name/gender combination, search each line of names.txt to see if it contains data for
that combination. If it is found, output
its data line to the console, then construct a DrawingPanel to graph the data (see next
page). Your code should not assume that the file is sorted alphabetically.
If the
combination is not found, output a "not found" message. No DrawingPanel should appear.
This program allows you to search through the
data from the Social Security Administration
to see how popular a particular name has been
since 1890.
Name: zOIDberG
Gender (M or F): m
"zOIDberG" not found.
Though
the data shown above has 13 decades' worth of rankings, your program should
work properly with any number of decades of data (at least 1). Since there is a limit to the size of the DrawingPanel, you'd only be able to see data
from 13 decades, but your code should process as many decades of data as it finds in the line. Do not assume that there will be exactly 13
decades when writing this program.
On the course website is a file named names2.txt with 8 decades of data to help
you test this behavior.
2. meanings.txt: descriptions
of the meanings of each name
If the
name/gender combination is found in names.txt, you should also read meanings.txt to find its meaning. The line containing the combination’s meaning
should be printed to the console and also drawn on the DrawingPanel. Every combination in names.txt is also in meanings.txt, so you do not need to worry
about a combination having rankings but no meaning data. Some combinations have long meanings that may
stretch past the right edge of the DrawingPanel.
Each
line of meanings.txt contains a name in upper case,
followed by a gender and meaning, as in:
MICHELLE f French, English French feminine form of MICHAEL
MICHELYNE f English (Modern) Pet form of MICHELLE
MICHI f Japanese Means "pathway" in Japanese.
MICHIAL m (no meaning found)
Though
the two input files contain different data, the task of searching for a
name/gender combination in names.txt is very similar to the task of
searching for a combination in meanings.txt. Your code should take advantage of this fact
and should avoid redundancy. You will be using several different Scanner objects for this program. You
will have one Scanner that you use to
read information from the console. You will use a different Scanner to read from each
file. And because the input file is line-based, you should construct
a different Scanner object for each
line of the input file, as in section 6.3 of the book. You should
write your code in such a way that you stop consuming lines of input once you
find one that has the name you’re searching for.
The panel's overall size is 780x560
pixels. Its background is white. It has light gray (Color.LIGHT_GRAY)
filled rectangles along its top and bottom with a black line at their bottom
and top, respectively. The two
rectangles are each 30 pixels tall and span across the entire panel, leaving an
open area of 780x500 pixels in the middle.
The line of data about the name's meaning appears in the top gray
rectangle at (0, 16).
Each decade
is represented by a width of 60 pixels.
The bottom light gray rectangle contains black labels for each decade,
8px from the bottom of the DrawingPanel. For example, with default constant values
(see style guidelines), the text "1890" is at (0, 552) and
"1910" is at (120, 552).
Rank |
Top y |
1 |
30 |
2,
3 |
31 |
4,
5 |
32 |
6,
7 |
33 |
... |
... |
996,
997 |
528 |
998,
999 |
529 |
0 |
530 |
Starting at the same x-coordinate, a
bar shows the name ranking data for each year.
The bars are green. Bars are half
as wide as each decade (30px). The table
at right shows the mapping between rankings and y-values of the tops of
bars. Y-values start at 30 (below the
top gray rectangle), and there is a vertical scaling factor of 2 between pixels
and rankings; divide a ranking by 2 to get its y-coordinate.
At the same coordinate as the top-left
of each bar, black text shows the name's rank for that decade. For example, Michelle was #4 in 1970, so
"4" appears at (480, 32). A 0 rank means the name was not in the top
1000. No bar should appear in such a
case, and "0" should be drawn right above the bottom gray bar. For example, in the screenshot above,
Michelle's 0 in 1920 is drawn at (180, 530).
We suggest
you begin with the text output and file processing, then
any "fixed" graphical output, and then the bars. The 0-ranking case is particularly tricky to
draw, so you may want to do this last.
(Hint: Treat rank 0 as a rank of 1000.)
Your program should work correctly
regardless of the capitalization the user uses to type the name. If the user types "LiSa"
or "lisa",
you should find it even though the input files have it as "Lisa"
and "LISA".
Draw text labels on the DrawingPanel
using the drawString
method of the Graphics
object. To draw an int
as text, you can convert it into a String
using the +
operator with an empty string. For
example, for an int
variable named n
with value 100,
the expression "" + n yields the String
"100". To draw this at (50, 120), you could write:
g.drawString("" + n, 50, 120);
You
should have at least these three class
constants. If the constant values
are changed, your output should adapt.
·
The
starting year of the input data, as
an integer (default of 1890)
e.g. If you change the start year to 1825, the program
should assume the data comes from 1825, 1835, etc.
·
The
width of each decade on the DrawingPanel, as an integer (default of 60)
e.g.
If you change the
width to 50, each green decade bar is 50px apart and 25px thick.
·
The
height of the legend rectangles, as
an integer (default of 30)
e.g. If you change
the legend height to 20, the gray rectangles at top and bottom are 20px
tall. The panel is 540px tall so that
the open area in the middle is 500px tall.
The decade labels are at y=532.
We
will be especially picky about redundancy.
For full credit, your methods should obey these constraints:
·
The main method should not draw on a DrawingPanel, nor read lines of input from a
file (nextLine).
·
The
method that asks the user for a name must not also read lines of input from a
file.
·
Split
the displaying of graphical data into at least two methods. For example, you could have one method to
draw "fixed" graphics (gray legend rectangles, etc.) and another for
graphics that come from the file (bars, ranks).
Your methods should be well-structured and avoid redundancy, and your main
method should be a concise summary of the overall program. Avoid "chaining," which is when
many methods call each other without ever returning to main.
For this assignment you are limited to
the language features in Chapters 1 through 6 of the textbook. In particular, you are not allowed to use arrays on this assignment. Follow past stylistic guidelines about indentation,
line lengths, identifier names, and localizing variables, and commenting at the
beginning of your program, at the start of each method, and on complex sections
of code.