In this section, we will perform various data analyses on the combined dataset you created in Part 0. You may use the math, matplotlib.pyplot, geopandas, and pandas packages, but you may not use any other imports.
Each of the functions below should be written in hw5_main.py and should take the merged data from Part 0 as a parameter.
percentage_food_data
Write a function called percentage_food_data that returns the percentage of census tracts in Washington that we have food access data for. The returned number should be a float between 0 and 100 (e.g. 73.212456). Note that this example is not necessarily the expected output; it is only meant to show what the returned number should look like.
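As a rough illustration of the kind of computation involved, here is a minimal sketch on made-up data. It assumes that tracts without food access data show up as missing values in a food access column after the merge; that assumption, the helper name percentage_non_missing, and the tract_id column are all hypothetical and not part of the assignment.

```python
import pandas as pd


def percentage_non_missing(data, column):
    """Return the percentage (0 to 100) of rows whose `column` is not missing."""
    if len(data) == 0:
        return 0.0
    has_data = data[column].notna().sum()
    return float(has_data / len(data) * 100)


# Toy stand-in for the merged data: 4 tracts, 3 of them with a value.
# 'tract_id' is a hypothetical column; the numbers are made up.
toy = pd.DataFrame({
    'tract_id': ['A', 'B', 'C', 'D'],
    'lapophalf': [120.0, None, 35.5, 0.0],
})

result = percentage_non_missing(toy, 'lapophalf')
print(result)
```

Note that a value of 0.0 still counts as having data; only missing entries are excluded.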
plot_map
Write a function called plot_map that plots a map of Washington. There is no need to customize this plot or add any data on top of it; it should just plot the shape of all the census tracts. The output should look like Washington state. You should save the plot in a file called washington_map.png.
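The plot-and-save pattern can be sketched as follows, assuming the merged data is a GeoDataFrame. The toy squares, the tract_id column, the helper name save_outline_map, and the file name toy_map.png are all made up for illustration; shapely is imported only to build the stand-in geometry and would not be needed in hw5_main.py.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box  # only used to build toy geometry

# Three unit squares standing in for census tracts.
toy = gpd.GeoDataFrame(
    {'tract_id': ['A', 'B', 'C']},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 1, 2)],
)


def save_outline_map(data, filename):
    """Draw the shapes in `data` with default styling and save to `filename`."""
    fig, ax = plt.subplots(1)
    data.plot(ax=ax)
    fig.savefig(filename)
    plt.close(fig)


save_outline_map(toy, 'toy_map.png')
```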
plot_population_map
Write a function called plot_population_map that plots a map of Washington with each census tract colored by its population. It is expected that there will be some missing census tracts. You should also include a legend to indicate what the colors mean. You should save the plot in a file called washington_population_map.png.
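Coloring shapes by a column and adding a legend might look like the sketch below. The geometry, population values, and output file name are made up; the middle square has no population value, so it is simply left out of the drawing, which mirrors the missing tracts mentioned above.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box  # only used to build toy geometry

# Toy data: the second "tract" has no population value.
toy = gpd.GeoDataFrame(
    {'POP2010': [1200.0, None, 450.0]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

fig, ax = plt.subplots(1)
# column= picks the values used for color; legend=True adds a color scale.
toy.plot(ax=ax, column='POP2010', legend=True)
fig.savefig('toy_population_map.png')
plt.close(fig)
```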
plot_population_county_map
Write a function called plot_population_county_map that plots a map of Washington with each county colored by its population. You'll need to aggregate the census tract data so that there is one row per county instead of one per census tract. It is expected that there will be some missing counties. You should also include a legend to indicate what the colors mean. You should save the plot in a file called washington_county_population_map.png.
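One way to aggregate tract-level rows into county-level rows is the GeoDataFrame dissolve method, sketched here on made-up data. Whether dissolve is the intended approach is an assumption; the geometry, population numbers, and file name are illustrative only.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box  # only used to build toy geometry

# Two toy tracts in one county and one tract in another (made-up numbers).
toy = gpd.GeoDataFrame(
    {'County': ['King', 'King', 'Pierce'], 'POP2010': [100, 250, 75]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 1, 2)],
)

# Keep only the needed columns, then merge shapes and sum values per county.
counties = toy[['County', 'POP2010', 'geometry']].dissolve(
    by='County', aggfunc='sum'
)
print(counties['POP2010'])

fig, ax = plt.subplots(1)
counties.plot(ax=ax, column='POP2010', legend=True)
fig.savefig('toy_county_population_map.png')
plt.close(fig)
```

After dissolve, the county name becomes the index of the result, and each county has a single merged geometry.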
plot_food_access_by_county
For this problem, you will write a function called plot_food_access_by_county that takes the merged data as a parameter and makes several plots on the same figure showing information about food access and low income. This problem is more complicated than the others, so we will provide a breakdown of the steps needed to solve it (some with provided code). Here is the final result that you should produce.
Create a local copy of the dataset as a GeoDataFrame that only has the columns 'County', 'geometry', 'POP2010', 'lapophalf', 'lapop10', 'lalowihalf', 'lalowi10'.
Compute columns named 'lapophalf_ratio', 'lapop10_ratio', 'lalowihalf_ratio', 'lalowi10_ratio' that store the ratio of people in that county that fall under each group respectively. These columns should be added to the local copy of the dataset.
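The ratio columns can be sketched with plain column arithmetic. This example assumes each ratio is the group's count divided by 'POP2010' for the same county, which is an interpretation of the step above rather than something it states outright. The table is a made-up, already county-level stand-in (a plain DataFrame without geometry, to keep the sketch small).

```python
import pandas as pd

# Made-up county-level totals; real data would also carry 'geometry'.
counties = pd.DataFrame({
    'County': ['King', 'Pierce'],
    'POP2010': [1000.0, 400.0],
    'lapophalf': [250.0, 100.0],
    'lapop10': [50.0, 0.0],
    'lalowihalf': [100.0, 40.0],
    'lalowi10': [10.0, 0.0],
})

# Add one '<name>_ratio' column per group: group count / total population.
for name in ['lapophalf', 'lapop10', 'lalowihalf', 'lalowi10']:
    counties[name + '_ratio'] = counties[name] / counties['POP2010']

print(counties[['County', 'lapophalf_ratio', 'lalowi10_ratio']])
```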
fig, [[ax1, ax2], [ax3, ax4]] = plt.subplots(2, figsize=(20, 10), ncols=2)
This line of code looks complicated, but all you need to know is that the variable fig stores a reference to the whole figure (i.e. the picture) and each of the variables that start with ax stores a reference to one of the subplots' axes.
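To see how the four ax variables map onto positions in the grid, a small experiment like the following may help. The titles and the output file name are placeholders chosen for this sketch.

```python
import matplotlib.pyplot as plt

# 2 rows x 2 columns of subplots; the nested list mirrors the grid layout.
fig, [[ax1, ax2], [ax3, ax4]] = plt.subplots(2, figsize=(20, 10), ncols=2)

# Label each subplot with its position so the layout is visible in the output.
ax1.set_title('top left')
ax2.set_title('top right')
ax3.set_title('bottom left')
ax4.set_title('bottom right')

fig.savefig('toy_grid.png')
```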
Draw each plot by calling the plot function on the dataset and changing the color by specifying the column you want. As before, each plot should have a legend. You'll need to specify the ax parameter and pass in the axis from the previous step to have it draw in the proper place. To keep things consistent, you should also specify vmin and vmax to be 0 and 1 respectively so they all use the same scale.
ax1.set_title('Foo')
fig.savefig('washington_county_food_access.png')
If these steps are done correctly, you should end up with a figure like the picture shown above.
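The steps above can be sketched on a smaller scale with two subplots instead of four. The geometry, ratio values, titles, and output file name are made up; only the ax, column, legend, vmin, and vmax arguments reflect the steps described.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box  # only used to build toy geometry

# Toy "counties" with made-up ratio values between 0 and 1.
toy = gpd.GeoDataFrame(
    {'lapophalf_ratio': [0.1, 0.6, 0.9], 'lapop10_ratio': [0.0, 0.2, 0.3]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

fig, [ax1, ax2] = plt.subplots(1, ncols=2, figsize=(10, 5))

# Same vmin/vmax on both plots so identical colors mean identical ratios.
toy.plot(ax=ax1, column='lapophalf_ratio', legend=True, vmin=0, vmax=1)
ax1.set_title('lapophalf_ratio')
toy.plot(ax=ax2, column='lapop10_ratio', legend=True, vmin=0, vmax=1)
ax2.set_title('lapop10_ratio')

fig.savefig('toy_food_access.png')
```

Without a shared vmin and vmax, each subplot would stretch its color scale to its own data range, making the plots hard to compare.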
Development Strategy: It might help to start by making these on separate plots and then figuring out how to plot them on the same figure.
plot_low_access_tracts
In this problem, we will plot all of the census tracts that are considered low access. You should write a function called plot_low_access_tracts that saves the information described below in a file named washington_low_access.png. The definition of low access depends on whether or not the census tract is "urban". The data is set up so that each census tract is either "urban" or "rural".
In this problem, you should compute all of the census tracts that match the definition above (depending on whether the tract is urban or not). We will then make a plot in layers (all on the same axis) to highlight the census tracts that have low food access. Because we are plotting on the same set of axes, a new plot will "draw over" the old one, which allows us to highlight exactly the tracts we want. You should plot the data in the following order.
Use color='#EEEEEE' when plotting to make the census tracts a light gray.
Use color='#AAAAAA' when plotting to make these census tracts a dark gray.
For this problem, you are NOT allowed to use the 'LATracts_half' or 'LATracts10' columns since we are trying to compute these. As a sanity check, you can verify that the census tracts you computed match the ones indicated by these columns, but you should not leave this sanity check code in when you submit.
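The layering idea can be sketched as below: each later call to plot on the same axis draws over the earlier ones. The boolean columns has_food_data and low_access are hypothetical placeholders for whatever subsets you compute, and which subset belongs to each layer is an assumption made for this sketch; the rule for deciding low access is not shown here.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box  # only used to build toy geometry

# Toy tracts with hypothetical, precomputed boolean flags.
toy = gpd.GeoDataFrame(
    {
        'has_food_data': [True, True, False, True],
        'low_access': [True, False, False, False],
    },
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1),
              box(0, 1, 1, 2), box(1, 1, 2, 2)],
)

fig, ax = plt.subplots(1)
# Layer 1: every tract in light gray as the background.
toy.plot(ax=ax, color='#EEEEEE')
# Layer 2: a subset drawn over the background in dark gray.
toy[toy['has_food_data']].plot(ax=ax, color='#AAAAAA')
# Layer 3: the highlighted subset, drawn last so it stays visible.
toy[toy['low_access']].plot(ax=ax)
fig.savefig('toy_low_access.png')
plt.close(fig)
```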