Lab 13: Birthday Visualization

Due Date: Must be checked off OR submitted to Canvas before the end of the day on 2/20/2018.


Goals

  • Additional practice with arrays.
  • Learn how to detect a mouse position in a grid.
  • Create a visual representation of data, meaning "information that has been abstracted in some schematic form." (citation)

Setup

The data for this lab can be found in the following three CSV ("comma separated values") files:

Once you have created a new .pde file for this assignment, make sure to save a copy of these files into the same folder.

Note: if you find later that you have an interest in doing a data-related final project, you can find some interesting data sets at this website. Make sure to talk to a TA about importing the data into Processing.


Birthday Popularity Visualization

We will create a visual representation of the popularity of birth dates: how each date of the calendar year ranks by the number of actual births on that day (irrespective of year).

We will then augment our visualization by displaying the actual ranking number when the user hovers the mouse over an individual day.

Step 0: Data Storage

Declare three int arrays at the top of the program named month, day, and rank. There are 366 possible dates in the Gregorian calendar (February 29 is not always there, but is still some people's birthday!). Each array will contain an entry for every possible date (i.e. each array has 366 elements):

  • month[i] returns the month (Jan = 1, Dec = 12) of that date
  • day[i] returns the day of the month (1-31) of that date
  • rank[i] returns the frequency ordering (most common birthday = 1, least common birthday = 366) of birthdays on that date

Now we need to populate these arrays with our data.

Step 1: Importing From Files

Take a peek inside the CSV files using a text editor (not Excel). There's nothing special there – just a bunch of numbers separated by commas. Imagine doing this assignment by hand! Computers can really help us organize and process the large amounts of data that we now regularly produce.

Declare a variable String[] temp at the top of your program. This will be used to hold the raw data that you import from the files. The code below should go in setup() and will read the month data from the CSV file into your array:

temp = loadStrings("month.csv");  // read file data into array of Strings
month = int(split(temp[0],','));  // convert 1st line of file (temp[0]) into integer array

Add code to similarly read the day and rank data from files.
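If you are curious what the split-and-convert step amounts to, here is a rough plain-Java sketch of the same idea. It is not how Processing implements int() or split(); the helper name parseInts and the sample line are made up for illustration and are not the contents of any of the lab's files.

```java
class CsvLine {
    // Hypothetical helper: turn one comma-separated line into an int array.
    static int[] parseInts(String line) {
        String[] parts = line.split(",");           // cut the line at each comma
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i].trim());  // text -> integer
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up sample line, NOT real lab data.
        int[] sample = parseInts("1,1,1,2");
        System.out.println("entries: " + sample.length);
        System.out.println("last entry: " + sample[sample.length - 1]);
    }
}
```

In your sketch, the two Processing lines shown above already do all of this for you; the point is only that each file is one long line of numbers that becomes one array.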

Check the numbers to verify, remembering that arrays are indexed starting from 0:

  • month[1] = 1, day[1] = 2, and rank[1] = 362, meaning that Jan. 2 is one of the least common birthdays.
  • Valentine's Day (Feb. 14) is index i = 31+14-1 = 44, so month[44] = 2, day[44] = 14, and rank[44] = 103.

Step 2: Date Grid

Write a function that displays the (roughly) 12x31 grid of rectangles seen on the right that corresponds to the days of the year, one column for each month. The image shows rectangles of size 38x18, but feel free to customize this.

It is tempting to generate a full 12x31 grid using nested for-loops and we actually encourage you to do so initially to verify your rectangle generation calculation.

However, it's not actually a full 12x31 grid! Think carefully about how to create this grid using a single for-loop (hint: it will involve month[] and day[]).
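One possible way to think about the single-loop version: each date's rectangle position depends only on its month (which column) and its day (which row). The plain-Java sketch below computes those positions. The names cellX and cellY, the grid corner (80, 70), and the 2-pixel gap are all assumptions chosen for illustration; only the 38x18 rectangle size comes from this handout.

```java
class DateGrid {
    // Hypothetical layout values; change them to match your own sketch.
    static final int X0 = 80, Y0 = 70;  // assumed upper-left corner of the grid
    static final int W = 38, H = 18;    // rectangle size mentioned in the handout
    static final int GAP = 2;           // assumed spacing between rectangles

    // Left edge of the column for a month (1-12).
    static int cellX(int month) {
        return X0 + (month - 1) * (W + GAP);
    }

    // Top edge of the row for a day of the month (1-31).
    static int cellY(int day) {
        return Y0 + (day - 1) * (H + GAP);
    }

    public static void main(String[] args) {
        // Stand-in data for three dates; the real arrays have 366 entries.
        int[] month = {1, 1, 2};
        int[] day = {1, 2, 14};
        for (int i = 0; i < month.length; i++) {
            System.out.println(month[i] + "/" + day[i] + " -> ("
                + cellX(month[i]) + ", " + cellY(day[i]) + ")");
        }
    }
}
```

In a Processing sketch, the body of the loop would call rect() with these coordinates instead of printing them. Because the loop visits only dates that exist in month[] and day[], the shorter months come out shorter without any special cases.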

Step 3: Encode Frequency Information as Colors

We will use a color range to encode the frequency information so that the data is more easily understandable. We create a key at the bottom by drawing a gradient as 160 vertical lines, using something similar to the following code:

for (int i = 0; i < 160; i++) {
   stroke(252-i, 247-i, 197-i);      // set color going from light (low i) to dark (high i) from left-to-right
   line(200+i, 700, 200+i, 700+30);  // draw vertical line
}

Feel free to customize as you please:

  • The gradient is currently positioned with the upper-left corner at position (200,700).
  • The gradient currently has height 30.
  • The gradient goes from color(252,247,197) (light) to color(252-159,247-159,197-159) = color(93,88,38) (dark).

Modify your grid-making function to fill the rectangles of the date grid with color based on the rank value by calling fill() before the call to rect(). Because the rank covers a range of 1 through 366 while the color range is only 0 through 159, we need to find where in the spectrum the rank falls. So we multiply the rank value by 160 and then divide by 366 to "condense" the rank range. Note: for reasons we won't get into here, make absolutely sure that you do the multiplication first before the division.

If you are not convinced that this works, try the top (1), middle (183), and bottom (366) ranks to see where they fall along the color range. Also make sure that the coloring matches your gradient (Jan. 1 is uncommon).
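The three suggested ranks can be checked with a small plain-Java sketch like the one below. The method names shadeOffset and wrongOrder are made up for illustration. The second method divides first, so you can compare the two orderings; with int values, division discards the fractional part.

```java
class RankShade {
    // Multiply first, then divide: spreads ranks 1..366 across the color range.
    static int shadeOffset(int rank) {
        return rank * 160 / 366;
    }

    // Divide first: rank / 366 is integer division, so it is 0 for every
    // rank below 366 and the multiplication has nothing left to scale.
    static int wrongOrder(int rank) {
        return rank / 366 * 160;
    }

    public static void main(String[] args) {
        for (int rank : new int[] {1, 183, 366}) {
            System.out.println(rank + ": " + shadeOffset(rank)
                + " vs " + wrongOrder(rank));
        }
    }
}
```

This prints 0, 80, and 160 for the multiply-first version, while the divide-first version gives 0, 0, and 160. Note that rank 366 lands on 160, one step past the last line of the 160-line key; the resulting color is still valid, but you could cap the value at 159 if you want an exact match.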

Step 4: Text Labeling

We need to label the rows, columns, and gradient bar so that others can understand your infographic! The code snippets below will help you, but note that you will need to replace the ellipsis (...) to complete the statements as well as change the positioning to match your display. Note: it may help to explicitly set the fill() for your text.

Label the column headings all at once:

text("Jan     Feb     Mar    ...", 90, 65);

Label the rows using a loop:

text(i+1, 65, 82 ...);

Label the ends of the gradient with text similar to "Least Common" and "Most Common".

Step 5: User Hover Display

We want to print out the ranking data when the user hovers the mouse over a rectangle, but this is probably too complex to safely program all at once. We will build up to the full functionality, testing different parts along the way.

display_values() function

Write a function named display_values() that is called from draw(). This function should declare two integers, m_index (for month) and d_index (for day), as the first two lines inside the function (this is different from our usual rule of putting variable declarations at the top of the program).

Based on the mouse position, these two variables should store the month (1-12) and day (1-31) indices of the box the user is hovering over. This calculation will involve the starting point/corner of your grid of rects, the width and height of the rects, and any space between those rects.

It will likely help to test that your code is working by using the lines below after you define your variables. Make sure the numbers being printed match where your mouse is on the screen.

Note: it is fine if it gives bogus indices for non-existent days such as February 30th.

println("month index: " + m_index);
println("day index: " + d_index);

As you move the mouse to the edge of the drawing canvas, you will notice that some of the values are nonsensical, like -1 and 33. Because we only want values that are in the table range, enclose the println() calls in an if-statement verifying that m_index and d_index are "in range".
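Here is one way the index calculation and range check might look, written as plain Java so it can be run outside Processing. The layout constants (grid corner at (80, 70), 2-pixel gap) and the names monthIndex, dayIndex, and inRange are assumptions for illustration; only the 38x18 rectangle size comes from this handout. The sketch uses Math.floorDiv instead of the / operator because plain integer division rounds toward zero, which would report index 1 for a mouse position just to the left of or above the grid.

```java
class HoverIndex {
    // Hypothetical layout values; change them to match your own grid.
    static final int X0 = 80, Y0 = 70;  // assumed upper-left corner of the grid
    static final int W = 38, H = 18;    // rectangle size mentioned in the handout
    static final int GAP = 2;           // assumed spacing between rectangles

    // Month column (1-12 when inside the grid) for a mouse x position.
    static int monthIndex(int mouseX) {
        return Math.floorDiv(mouseX - X0, W + GAP) + 1;
    }

    // Day row (1-31 when inside the grid) for a mouse y position.
    static int dayIndex(int mouseY) {
        return Math.floorDiv(mouseY - Y0, H + GAP) + 1;
    }

    // True only when both indices fall inside the 12x31 table.
    static boolean inRange(int m, int d) {
        return m >= 1 && m <= 12 && d >= 1 && d <= 31;
    }

    public static void main(String[] args) {
        int m = monthIndex(125);  // pretend mouseX is 125
        int d = dayIndex(75);     // pretend mouseY is 75
        if (inRange(m, d)) {
            System.out.println("month index: " + m);
            System.out.println("day index: " + d);
        }
    }
}
```

This version treats the gap to the right of and below each rectangle as part of that rectangle's cell, which keeps the arithmetic simple. If you want the gaps to register as "not hovering", you would need an additional check.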

Finding the Rank

To print out the rank of a birthday, we need to know where it is in our list of 366 items in rank[]. Our problem is that each month has a different number of days, so we cannot just multiply it out like we did for the pixel position in the Color Checker assignment.

We solve the problem by creating a new array of 13 items: the index of the beginning of each of the 12 months, followed by a final entry holding the total of 366:

int[] dayTotal = {0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366};

Use this new array to calculate the correct index to access rank[] with. Don't forget to account for the zero-indexing of arrays. It is recommended that you verify your calculation using a println() statement.
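As a rough illustration of the lookup, the plain-Java sketch below converts a 1-based month and day into a 0-based position in the 366-entry arrays. The method names dateIndex and dateExists are made up for this example. The second method is optional: it uses the difference between neighboring dayTotal entries as the length of a month, which is one possible use for the final 366 entry.

```java
class RankLookup {
    // Starting index of each month, plus a final entry of 366 (from the handout).
    static final int[] dayTotal =
        {0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366};

    // 0-based array index for a month (1-12) and day (1-31).
    static int dateIndex(int m, int d) {
        return dayTotal[m - 1] + (d - 1);
    }

    // True if that day really exists in that month (e.g. rejects Feb. 30).
    static boolean dateExists(int m, int d) {
        return m >= 1 && m <= 12
            && d >= 1 && d <= dayTotal[m] - dayTotal[m - 1];
    }

    public static void main(String[] args) {
        System.out.println("Jan. 2  -> " + dateIndex(1, 2));
        System.out.println("Feb. 14 -> " + dateIndex(2, 14));
        System.out.println("Feb. 30 exists? " + dateExists(2, 30));
    }
}
```

The first two results, 1 and 44, agree with the indices used to check the imported data in Step 1.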

Display the Rank

If your index calculation is correct, then the last thing to do is to show the rank on the drawing canvas at the tip of the mouse:

text(rank[ ... ], mouseX, mouseY);

Make sure to replace the ellipsis with your index calculation.

Contrasting Text

Because the rect fill color ranges from light to dark, hover text in a single color will be difficult to read for some dates. Set the fill() for the text using an if-statement that checks whether the user is hovering over a "light-enough" rank or a "dark-enough" rank.
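A minimal plain-Java sketch of that decision is shown below. It assumes that low ranks are drawn with the light end of the gradient, as in the Step 3 mapping, and it picks the midpoint rank of 183 as the cutoff. Both the cutoff and the method name textGray are assumptions; tune the threshold by eye for your own colors.

```java
class HoverTextShade {
    // Hypothetical cutoff between "light-enough" and "dark-enough" squares.
    static final int THRESHOLD = 183;

    // Grayscale value for the hover text: 0 (black) on light squares,
    // 255 (white) on dark squares. Assumes low rank = light fill.
    static int textGray(int rank) {
        if (rank <= THRESHOLD) {
            return 0;
        } else {
            return 255;
        }
    }

    public static void main(String[] args) {
        System.out.println("rank 1   -> " + textGray(1));
        System.out.println("rank 366 -> " + textGray(366));
    }
}
```

In a sketch, the returned value would be passed to fill() just before the text() call that draws the rank.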

Example Solution

A working solution is shown on the right.

This visualization has its color gradient in shades of blue. You can see the ranking numbers appear on the mouse tip as the mouse hovers over different squares. The hover text is black for the lighter squares and white for the darker squares.


Checkoff

  1. Run your program for your TA to see.
    • Displays an uneven grid of rectangles with month and day axes properly labeled.
    • The grid is shaded based on popularity rankings and matches the labeled gradient key at the bottom.
    • Hovering over a grid space with the mouse displays the ranking number, and the text color changes with the color of the grid space so that it stays readable.
  2. Show your code to your TA.
    • The month[], day[], and rank[] array values are correctly imported from CSV files.
    • Important lines or blocks of code should be commented.
    • There should be a block comment at the top describing what the program does, including the names of your group members.
    • All functions, loops, and code blocks should be properly indented.