Exercise: Visualizing Uncertainty
In this exercise, you will design and assess visualizations of uncertain data.
Task 1: Experimental Results
Given experimental data, how might you best convey the results? Are there meaningful differences between conditions?
You’ve been asked to help analyze measurements of birds. Scientists have measured the length and depth of bird beaks on a number of islands. Based on other physical characteristics of the birds, the data have been grouped into two conditions (A and B).
const birds = FileAttachment("../data/birds.json").json();
For your first analysis question, you’ve been asked to assess whether beak_length varies significantly across conditions. Create two visualizations to compare beak_length by condition: one focused on the mean length, and another focused on the distribution of values.
Convey the mean (average) lengths
Plot the average beak_length for each condition. Also include a measure of spread, such as the standard error of the mean or the interquartile range.
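As a starting point, the per-condition mean and standard error can be computed before any plotting. The sketch below is one possible approach, not a required solution. It assumes each record is an object with `condition` and `beak_length` fields; the `sample` array and the `summarizeByCondition` helper are hypothetical names, and the toy values are made up for illustration rather than taken from birds.json.

```js
// Hypothetical toy records for illustration only, NOT the real birds.json values.
const sample = [
  {condition: "A", beak_length: 10},
  {condition: "A", beak_length: 12},
  {condition: "A", beak_length: 14},
  {condition: "B", beak_length: 20},
  {condition: "B", beak_length: 22},
  {condition: "B", beak_length: 27}
];

// Group rows by condition, then compute the mean and standard error of the mean.
// Assumes every group has at least two values (the sample variance needs n - 1 > 0).
function summarizeByCondition(rows, field) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.condition)) groups.set(row.condition, []);
    groups.get(row.condition).push(row[field]);
  }
  return Array.from(groups, ([condition, values]) => {
    const n = values.length;
    const mean = values.reduce((sum, v) => sum + v, 0) / n;
    // Sample variance: squared deviations divided by n - 1.
    const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1);
    const sem = Math.sqrt(variance / n);
    // Rough 95% interval using the normal approximation (mean ± 1.96 SEM);
    // this is only reasonable when groups are fairly large.
    return {condition, n, mean, sem, lo: mean - 1.96 * sem, hi: mean + 1.96 * sem};
  });
}

const summary = summarizeByCondition(sample, "beak_length");
```

The resulting rows have one entry per condition, so they could be drawn with, for example, a dot for `mean` and a rule spanning `lo` to `hi`. Swapping the standard error for the interquartile range would change what the interval means: the former describes uncertainty about the mean, the latter the spread of the raw measurements.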
What measure of spread did you choose, and why?
- Write your answer here
Convey the distribution
Now instead create a visualization intended to better convey the overall distribution of beak_length measurements, again divided by condition. Examples might include plotting raw values, histograms, or density plots.
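If you go the histogram route, it can help to see what binning does to the raw values. The following is a minimal sketch of fixed-width binning; the `histogram` helper and the `toyLengths` values are hypothetical and exist only to illustrate the idea. A plotting library will normally do this step for you.

```js
// Count how many values fall into each fixed-width bin [start, start + binWidth).
function histogram(values, binWidth) {
  const counts = new Map();
  for (const v of values) {
    const start = Math.floor(v / binWidth) * binWidth;
    counts.set(start, (counts.get(start) ?? 0) + 1);
  }
  return Array.from(counts, ([start, count]) => ({start, end: start + binWidth, count}))
    .sort((a, b) => a.start - b.start);
}

// Hypothetical toy values for illustration only, NOT the real measurements.
const toyLengths = [10, 11, 12, 14, 15, 19];
const bins = histogram(toyLengths, 5);
```

To compare conditions, the same function could be applied to each condition's values separately. Note that the choice of bin width changes the apparent shape of the distribution, which is one reason raw-value or density plots are worth trying alongside a histogram.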
Based on the charts you’ve created, how would you describe the difference between condition A and condition B?
- Write your answer here
Task 2: Assessing Correlation
Your next task is to assess the relationship between bill_length and bill_depth, again grouped by condition. Below is a scatter plot of the two variables, overlaid with a linear regression model fit to all of the data. Modify the chart to include an additional layer that shows regression fits per condition, with each regression line color coded by condition.
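To see why an overall fit and per-condition fits can disagree, it may help to compute the regression lines by hand. The sketch below fits ordinary least squares lines to all rows and to each condition separately. It assumes records with `condition`, `bill_length`, and `bill_depth` fields; `linearFit` and `toyBills` are hypothetical names, and the toy values were constructed so that the pooled and grouped slopes differ. Whether the real data behaves this way is for you to find out.

```js
// Hypothetical toy records constructed for illustration, NOT the real birds.json values.
const toyBills = [
  {condition: "A", bill_length: 1, bill_depth: 5},
  {condition: "A", bill_length: 2, bill_depth: 6},
  {condition: "A", bill_length: 3, bill_depth: 7},
  {condition: "B", bill_length: 5, bill_depth: 1},
  {condition: "B", bill_length: 6, bill_depth: 2},
  {condition: "B", bill_length: 7, bill_depth: 3}
];

// Ordinary least squares fit of yField on xField: y = intercept + slope * x.
function linearFit(rows, xField, yField) {
  const n = rows.length;
  const meanX = rows.reduce((sum, r) => sum + r[xField], 0) / n;
  const meanY = rows.reduce((sum, r) => sum + r[yField], 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (const r of rows) {
    const dx = r[xField] - meanX;
    sxy += dx * (r[yField] - meanY);
    sxx += dx * dx;
  }
  const slope = sxy / sxx;
  return {slope, intercept: meanY - slope * meanX};
}

// One fit over all rows, then one fit per condition.
const overallFit = linearFit(toyBills, "bill_length", "bill_depth");
const fitsByCondition = new Map();
for (const condition of new Set(toyBills.map(d => d.condition))) {
  const rows = toyBills.filter(d => d.condition === condition);
  fitsByCondition.set(condition, linearFit(rows, "bill_length", "bill_depth"));
}
```

In these toy values the pooled slope is negative while each condition's slope is positive. In the chart itself you likely do not need a hand-written fit: if the scatter plot uses Observable Plot, a second regression mark with its stroke channel set to condition should produce one fitted line per group.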
What do the different regression models (overall vs. subdivided) convey? Describe the trends and try to make sense of any potential contradictions you find.
- Write your answer here
Task 3: Election Uncertainty
Two politicians are locked in a tight race and vote counting is underway. How might you communicate the uncertainty and/or likely outcome of the race?
Mickey Mouse and Donald Duck are locked in a tight election to become mayor of Disneyland. With exactly 80% of the total vote counted, the state of the race is:
Candidate | Votes |
---|---|
Mickey Mouse | 18,073 |
Donald Duck | 17,847 |
The votes counted so far have come from all parts of Disneyland, and there is no evidence of any bias in the current counts relative to Disneyland’s population.
Create a visualization that conveys not just the state of the race, but its associated uncertainty. You are free to apply statistical methods to the data (in which case you may wish to search for appropriate methods online), but you may also consider forms of uncertainty visualization that do not require statistical modeling.
// put code or image here
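If you choose the statistical route, one option is to simulate plausible final results. With 35,920 votes counted at 80%, about 8,980 votes remain. The sketch below is one simple model among many, not the expected answer: it assumes the uncounted votes resemble the counted ones (as the prompt states), draws a plausible underlying vote share for Mickey, and then draws the remaining votes using a normal approximation to the binomial. The names `mulberry32`, `randomNormal`, and `simulateRace` are hypothetical helpers, and the seed and trial count are arbitrary.

```js
// Small seeded pseudo-random generator (mulberry32) so that runs are reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Standard normal draw via the Box-Muller transform.
function randomNormal(rand) {
  const u = 1 - rand(); // in (0, 1], avoids Math.log(0)
  const v = rand();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Simulate final margins (Mickey minus Donald) under the assumptions stated above.
function simulateRace({mickey, donald, fractionCounted}, trials, seed) {
  const rand = mulberry32(seed);
  const counted = mickey + donald;
  const remaining = Math.round(counted / fractionCounted) - counted;
  const share = mickey / counted;
  // Sampling uncertainty in Mickey's share of the counted votes.
  const shareSd = Math.sqrt((share * (1 - share)) / counted);
  const margins = [];
  for (let i = 0; i < trials; i++) {
    // Step 1: draw a plausible underlying share, clamped to [0, 1].
    const p = Math.min(1, Math.max(0, share + shareSd * randomNormal(rand)));
    // Step 2: draw Mickey's remaining votes (normal approximation to the binomial).
    const draw = remaining * p + Math.sqrt(remaining * p * (1 - p)) * randomNormal(rand);
    const mickeyRest = Math.min(remaining, Math.max(0, Math.round(draw)));
    const donaldRest = remaining - mickeyRest;
    margins.push(mickey + mickeyRest - (donald + donaldRest));
  }
  return margins;
}

const margins = simulateRace({mickey: 18073, donald: 17847, fractionCounted: 0.8}, 5000, 42);
// Share of simulated outcomes in which Mickey finishes ahead.
const mickeyWinShare = margins.filter(m => m > 0).length / margins.length;
```

The simulated margins could feed a histogram, a quantile dot plot, or an animation that shows one hypothetical outcome at a time. Each choice emphasizes a different aspect of the uncertainty, which is worth keeping in mind when you answer the questions that follow.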
In addition to your visualization, address the following questions:
What aspect of uncertainty are you attempting to convey with your image?
- Write your answer here
How well do you believe your image achieves this goal? Why?
- Write your answer here
Don’t forget to add, commit, and push your exercises to your GitLab repo!