Exercise: Visualizing Uncertainty

In this exercise, you will design and assess visualizations of uncertain data.


Task 1: Experimental Results

Given experimental data, how might you best convey the results? Are there meaningful differences between conditions?

You’ve been asked to help analyze measurements of birds. Scientists have measured the length and depth of bird beaks on a number of islands. Based on other physical characteristics of the birds, the data have been grouped into two conditions (A and B).

// Load the bird measurement data.
const birds = FileAttachment("../data/birds.json").json();

For your first analysis question, you’ve been asked to assess whether beak_length varies significantly across conditions. Create two visualizations to compare beak_length by condition: one focused on the mean length, and another focused on the distribution of values.

Convey the mean (average) lengths

Plot the average beak_length for each condition. Also include a measure of spread, such as the standard error of the mean or the interquartile range.
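As a starting point, the summary statistics behind such a chart can be computed directly. The following is a minimal sketch, not a reference solution: the helper name summarizeByGroup, the field name "condition", and the sample rows are all assumptions made for illustration, and the 1.96 multiplier assumes an approximately normal sampling distribution of the mean.

```javascript
// Hypothetical helper: per-group mean, standard deviation, and standard error.
// The field names passed in ("condition", "beak_length") are assumptions.
function summarizeByGroup(rows, groupKey, valueKey) {
  const groups = new Map();
  for (const row of rows) {
    const value = row[valueKey];
    // Skip missing or non-numeric measurements.
    if (typeof value !== "number" || Number.isNaN(value)) continue;
    if (!groups.has(row[groupKey])) groups.set(row[groupKey], []);
    groups.get(row[groupKey]).push(value);
  }
  return Array.from(groups, ([group, values]) => {
    const n = values.length;
    const mean = values.reduce((sum, v) => sum + v, 0) / n;
    // Sample variance (n - 1 denominator); undefined for a single value.
    const variance =
      n > 1 ? values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1) : NaN;
    const sd = Math.sqrt(variance);
    const sem = sd / Math.sqrt(n);
    // Approximate 95% interval for the mean, assuming normality.
    return { group, n, mean, sd, sem, lo: mean - 1.96 * sem, hi: mean + 1.96 * sem };
  });
}

// Made-up rows for illustration only; these are not values from birds.json.
const sampleRows = [
  { condition: "A", beak_length: 10 },
  { condition: "A", beak_length: 12 },
  { condition: "A", beak_length: 14 },
  { condition: "B", beak_length: 20 },
  { condition: "B", beak_length: 22 }
];
const beakSummary = summarizeByGroup(sampleRows, "condition", "beak_length");
console.log(beakSummary);
```

If birds resolves to an array of objects with those fields, a call such as summarizeByGroup(birds, "condition", "beak_length") would yield one row per condition, suitable for a point-plus-interval chart. Note that sd describes the spread of individual birds, while sem describes uncertainty about the mean; the two answer different questions.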

What measure of spread did you choose, and why?

Convey the distribution

Now instead create a visualization intended to better convey the overall distribution of beak_length measurements, again divided by condition. Examples might include plotting raw values, histograms, or density plots.
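If you choose histograms, it may help to bin both conditions on the same grid so the bars are directly comparable. The sketch below is one possible way to do that by hand; the helper name histogramByGroup, the field name "condition", the bin width, and the sample rows are illustrative assumptions, and a plotting library's built-in binning would serve equally well.

```javascript
// Hypothetical helper: count values per (group, bin) using shared bin edges.
// Bins are aligned to multiples of binWidth so every group uses the same grid.
function histogramByGroup(rows, groupKey, valueKey, binWidth) {
  const counts = new Map();
  for (const row of rows) {
    const value = row[valueKey];
    // Skip missing or non-numeric measurements.
    if (typeof value !== "number" || Number.isNaN(value)) continue;
    const binStart = Math.floor(value / binWidth) * binWidth;
    const key = `${row[groupKey]}|${binStart}`;
    const entry = counts.get(key) ?? {
      group: row[groupKey],
      binStart,
      binEnd: binStart + binWidth,
      count: 0
    };
    entry.count += 1;
    counts.set(key, entry);
  }
  // Order by group, then by bin position.
  return [...counts.values()].sort(
    (a, b) => String(a.group).localeCompare(String(b.group)) || a.binStart - b.binStart
  );
}

// Made-up rows for illustration only; these are not values from birds.json.
const demoRows = [
  { condition: "A", beak_length: 9 },
  { condition: "A", beak_length: 11 },
  { condition: "A", beak_length: 14 },
  { condition: "B", beak_length: 14 },
  { condition: "B", beak_length: 19 }
];
const beakBins = histogramByGroup(demoRows, "condition", "beak_length", 5);
console.log(beakBins);
```

The bin width strongly shapes what a histogram appears to show, so it is worth trying a few widths before settling on one.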

Based on the charts you’ve created, how would you describe the difference between condition A and condition B?

Task 2: Assessing Correlation

Your next task is to assess the relationship between bill_length and bill_depth, again grouped by condition. Below is a scatter plot of the two variables, overlaid with a linear regression model fit to all of the data. Modify the chart to add a layer that shows a separate regression fit for each condition, with each regression line color-coded by condition.

What do the different regression models (overall vs. subdivided) convey? Describe the trends and try to make sense of any potential contradictions you find.
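To see why an overall fit and per-condition fits can tell different stories, it can help to compute the fits on a tiny example. The sketch below uses ordinary least squares on synthetic points that were deliberately constructed so that the pooled slope and the per-group slopes disagree; the helper name fitLine and the generic x and y fields are assumptions, and whether the birds data behaves the same way is for you to check.

```javascript
// Ordinary least-squares fit of y = intercept + slope * x.
function fitLine(points) {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (const p of points) {
    sxy += (p.x - meanX) * (p.y - meanY);
    sxx += (p.x - meanX) ** 2;
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Synthetic points for illustration only; these are not values from birds.json.
const syntheticPoints = [
  { condition: "A", x: 1, y: 5 },
  { condition: "A", x: 2, y: 6 },
  { condition: "A", x: 3, y: 7 },
  { condition: "B", x: 6, y: 1 },
  { condition: "B", x: 7, y: 2 },
  { condition: "B", x: 8, y: 3 }
];

const pooledFit = fitLine(syntheticPoints);
const fitForA = fitLine(syntheticPoints.filter((p) => p.condition === "A"));
const fitForB = fitLine(syntheticPoints.filter((p) => p.condition === "B"));
console.log({ pooledFit, fitForA, fitForB });
```

In this constructed example each group trends upward while the pooled line trends downward, because the groups differ in both x and y. When a pattern like this appears, the grouping variable is doing much of the work in the pooled fit.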

Task 3: Election Uncertainty

Two politicians are locked in a tight race and vote counting is underway. How might you communicate the uncertainty and/or likely outcome of the race?

Mickey Mouse and Donald Duck are locked in a tight election to become mayor of Disneyland. With exactly 80% of the total vote counted, the state of the race is:

Candidate      Votes
Mickey Mouse   18,073
Donald Duck    17,847

The votes counted so far have come from all parts of Disneyland, and there is no evidence of any bias in the current counts relative to Disneyland’s population.

Create a visualization that conveys not just the state of the race, but its associated uncertainty. You are free to apply statistical methods to the data (in which case you may wish to search for appropriate methods online), but you may also consider forms of uncertainty visualization that do not require statistical modeling.
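If you take the statistical route, one simple option is to simulate the uncounted votes. The sketch below is a rough illustration under strong assumptions, not a recommended model: it treats the race as having only two candidates, infers the number of outstanding votes from the 80% figure, and draws each remaining vote independently using the share observed so far. It ignores uncertainty in that share itself, so it likely understates the true uncertainty somewhat. All names (seededRandom, simulateFinalMargins, and so on) are hypothetical.

```javascript
// Small seeded pseudo-random generator so the sketch is repeatable.
function seededRandom(seed) {
  let state = seed >>> 0;
  return function () {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Counts from the exercise; the remaining total assumes exactly 80% is counted.
const countedVotes = { mickey: 18073, donald: 17847 };
const countedTotal = countedVotes.mickey + countedVotes.donald;
const remainingVotes = Math.round(countedTotal / 0.8) - countedTotal;

// Simulate the final margin (Mickey minus Donald) many times.
function simulateFinalMargins(nSims, seed) {
  const random = seededRandom(seed);
  // Assumption: remaining votes follow the share observed so far.
  const shareMickey = countedVotes.mickey / countedTotal;
  const margins = [];
  for (let s = 0; s < nSims; s++) {
    let mickeyRest = 0;
    for (let i = 0; i < remainingVotes; i++) {
      if (random() < shareMickey) mickeyRest++;
    }
    const donaldRest = remainingVotes - mickeyRest;
    margins.push(countedVotes.mickey + mickeyRest - (countedVotes.donald + donaldRest));
  }
  return margins;
}

const finalMargins = simulateFinalMargins(1000, 42);
const mickeyWinShare = finalMargins.filter((m) => m > 0).length / finalMargins.length;
console.log(remainingVotes, mickeyWinShare);
```

The array of simulated margins could feed a histogram, a quantile dot plot, or a set of hypothetical outcomes shown one at a time. Which encoding communicates the uncertainty most honestly is part of what the exercise asks you to judge.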

// put code or image here

In addition to your visualization, address the following questions:

What aspect of uncertainty are you attempting to convey with your image?

How well do you believe your image achieves this goal? Why?


Don’t forget to add, commit, and push your exercises to your GitLab repo!