CSE 163, Winter 2020: Homework 5: Part 0

Overview

In this part of the homework, you will load and merge the data from the files. See the Overview for a description of the dataset files.

Expectations

  • For this part of the assignment, you may import and use the geopandas and pandas packages, but you may not use any other imports to solve this problem.
  • The first line of your file should be a comment with your uwnetid.
  • You should not use any loops over the dataset in this part of the assignment.

Loading the Data

In hw5_main.py, write a function called load_in_data that takes two parameters: the file name of a shape file of Census Tract shapes and the file name of a CSV containing food access data. load_in_data should return a GeoDataFrame that has the two datasets merged together. For example, to call this function with the provided files:

load_in_data('tl_2010_53_tract00/tl_2010_53_tract00.shp', 'food-access.csv')

This function should join the data on the columns that indicate the census tract ID using the merge function. For the shape file, this column is called CTIDFP00, while it is called CensusTract in the CSV file. You may assume the given files contain these column names for merging. However, for flexibility, you should make no other assumptions about the other columns in the datasets in this function. Even though the documentation linked above is for pandas, geopandas objects have the same method.
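To illustrate how merge behaves when the key columns have different names, here is a minimal sketch on two tiny, made-up pandas DataFrames. Only the key column names CTIDFP00 and CensusTract come from the assignment; the other columns (shape_name, population) and all values are hypothetical. It uses plain pandas rather than geopandas so it runs without any data files, and it is not meant as the solution to load_in_data.

```python
import pandas as pd

# Hypothetical stand-ins for the two datasets. Only the key column
# names (CTIDFP00 and CensusTract) are taken from the assignment.
shapes = pd.DataFrame({
    'CTIDFP00': [1, 2, 3],
    'shape_name': ['a', 'b', 'c'],
})
food = pd.DataFrame({
    'CensusTract': [1, 2, 3],
    'population': [10, 20, 30],
})

# left_on / right_on name the key column in each dataset.
merged = shapes.merge(food, left_on='CTIDFP00', right_on='CensusTract')

print(merged.columns.tolist())
print(len(merged))
```

Notice that when the key columns have different names, both of them are kept in the result. This is worth keeping in mind when you check the number of columns in your merged data.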

It is possible that there are census tracts in the shape file that do not have corresponding data for food access; these census tracts should still appear in the merged data but the values for the food access information should be "missing values".
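As a small sketch of what "missing values" looks like in practice, the example below merges two made-up pandas DataFrames where one tract has no food access row. The how='left' argument is one way to keep every row of the left dataset; the column population and all values are hypothetical, and the real datasets are not used here.

```python
import pandas as pd

# Hypothetical data: tract 2 has a shape but no food access information.
tracts = pd.DataFrame({'CTIDFP00': [1, 2, 3]})
food = pd.DataFrame({
    'CensusTract': [1, 3],
    'population': [10, 30],
})

# how='left' keeps every row of the left dataset (tracts), filling the
# columns from the right dataset with NaN where there is no match.
merged = tracts.merge(food, left_on='CTIDFP00',
                      right_on='CensusTract', how='left')

print(merged)
print(merged['population'].isna().sum())
```

All three tracts appear in the result, and the food access columns for tract 2 hold NaN, which is how pandas represents missing values.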

For the provided datasets, your function should return a GeoDataFrame with 1318 rows and 30 columns. As a sanity check, our solution is 4 lines long.