This assignment and its reflection are due by Thursday, Feb 27 at 11:59 pm.
You should submit your finished hw5_main.py on Ed and the reflection on Google Forms.
In this assignment, you will do a bit of data analysis involving geospatial data in order to investigate food deserts in Washington state.
After this homework, students will be able to:
Here are the baseline expectations for this assignment:
Follow the course collaboration policies
hw5_main.py that uses the main method pattern that calls every method you write using the provided dataset.
You should download the starter code hw5.zip and open it as the project in Visual Studio Code. The files included are:
cse163_utils.py: A file where we will store utility functions to help you write tests.
You need to import cse163_utils.py in your hw5_main.py to make sure the plotting works. However, this causes problems with flake8 because the import is technically unused. In this case, you are allowed to bypass flake8 by importing with this syntax: import cse163_utils  # noqa: F401
tl_2010_53_tract00: A directory containing all of the shapefile information. You will most likely only be working with the file tl_2010_53_tract00/tl_2010_53_tract00.shp inside this directory. The data is described below.
food-access.csv: CSV file containing information about food access. The data is described below.
In this assignment, you will be working with two datasets.
This can be a lot to take in at first. Remember the goal here is to count how many people in a given census tract do not have easy access to food. For this dataset, we define "low access" as being more than X miles from a food source. For urban areas, we want to look at the number of people more than half a mile away from their closest food source (lapophalf), while for rural environments, we want to look at people more than 10 miles away from their closest food source (lapop10). We will use these counts to determine if a census tract is low access as a whole to identify likely "food deserts".
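The urban/rural rule above can be sketched in a few lines. This is only an illustration: the function name, the `is_urban` flag, and the sample numbers are hypothetical and do not come from the dataset. Only the column names lapophalf and lapop10 are taken from the description above.

```python
def low_access_population(is_urban, lapophalf, lapop10):
    """Pick the low-access head count that matches the tract type.

    Urban tracts use the half-mile count (lapophalf); rural tracts use
    the 10-mile count (lapop10). `is_urban` is a hypothetical flag.
    """
    return lapophalf if is_urban else lapop10


# Made-up tracts, purely for illustration.
tracts = [
    {"is_urban": True, "lapophalf": 1200.0, "lapop10": 0.0},
    {"is_urban": False, "lapophalf": 900.0, "lapop10": 150.0},
]

counts = [low_access_population(**tract) for tract in tracts]
print(counts)
```

The point of the sketch is that the rural tract contributes 150 people rather than 900, because the half-mile count is not the relevant measure there.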
The first dataset you will be using comes from the 2010 census. The information is stored in the tl_2010_53_tract00 directory, but you will most likely only be using the tl_2010_53_tract00/tl_2010_53_tract00.shp file as the access point to this data. The shapefile is similar to a CSV in the sense that it has columns and rows, but it has special functionality for geospatial data. Each row of the dataset corresponds to one census tract. The data has many columns, but you only need to understand the following:
This dataset only has entries for census tracts in Washington state.
The second dataset stores information about food access in each of these census tracts. The file is stored in the CSV format that we have been using all quarter. Each row in the dataset corresponds to a census tract. The data has many columns, but you only need to understand the following:
LATracts_half does.
LATracts10 does.
lapophalf but only counts the people that are considered low access and low income.
lapop10 but only counts the people that are considered low access and low income.
This dataset has entries for the entire country.
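Because the census data covers only Washington while the food-access data covers the entire country, any analysis that combines them has to restrict the food-access rows to Washington tracts at some point. Here is a minimal sketch of that idea using plain Python, assuming both datasets share a tract identifier. The key name `tract_id`, the identifiers, and the numbers are all made up for illustration and are not the real column names or values.

```python
# Hypothetical tract identifiers taken from the Washington-only census data.
washington_tracts = {"wa_tract_a", "wa_tract_b"}

# Hypothetical food-access rows; the real file has entries for every state.
food_access_rows = [
    {"tract_id": "wa_tract_a", "lapophalf": 310.0},
    {"tract_id": "other_state_tract", "lapophalf": 2045.0},
    {"tract_id": "wa_tract_b", "lapophalf": 0.0},
]


def keep_known_tracts(rows, known_ids):
    """Return only the rows whose tract identifier appears in known_ids."""
    return [row for row in rows if row["tract_id"] in known_ids]


wa_rows = keep_known_tracts(food_access_rows, washington_tracts)
print(len(wa_rows))
```

In a data-frame library the same restriction would usually be expressed as a join on the shared identifier rather than a manual filter.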
Just like for HW4, to avoid having to duplicate the datasets, we will all use a shared location for the data. You can find the data files on Ed at the locations below. Your submission must use these path names.
/course/food-access/tl_2010_53_tract00/tl_2010_53_tract00.shp
/course/food-access/food-access.csv
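One way to keep these paths in a single place is to define them as constants in hw5_main.py and use them from a main function. The sketch below shows the general shape of the main method pattern, not the required structure of your solution. The constant names and the `describe_inputs` helper are hypothetical, and the helper only checks whether the files are reachable, so the sketch also runs outside Ed.

```python
import os

# On Ed you would also add the starter-code import mentioned earlier:
#   import cse163_utils  # noqa: F401
# It is left out here so the sketch runs without the starter code.

# Shared data locations on Ed, copied from the list above.
TRACTS_PATH = '/course/food-access/tl_2010_53_tract00/tl_2010_53_tract00.shp'
FOOD_ACCESS_PATH = '/course/food-access/food-access.csv'


def describe_inputs(paths):
    """Hypothetical helper: map each path to whether it exists on disk."""
    return {path: os.path.exists(path) for path in paths}


def main():
    status = describe_inputs([TRACTS_PATH, FOOD_ACCESS_PATH])
    for path, found in status.items():
        print(path, 'found' if found else 'missing')
    # Calls to the functions you write for this assignment would go here.


if __name__ == '__main__':
    main()
```

Keeping the paths as module-level constants means only one place needs to change if you prototype against local copies of the data.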
You can access a playground notebook here. We recommend trying it out to see what the dataset looks like and to prototype your solutions!
Part 2a: Submit Assignment and Part 2b: Complete Reflection. On Ed, you should submit all the files listed below.
hw5_main.py
Your submission will be evaluated on the following dimensions:
hw5_main.py is a comment with your name and uwnetid. Your file should also have a comment at the top explaining what this file should be used for.
flake8.