Geospatial Data Practice

In this section, we will practice manipulating and plotting geospatial data.

In [2]:
import geopandas as gpd
import matplotlib.pyplot as plt

Our Dataset: Countries!

Run the cell below to see the countries GeoDataFrame from yesterday's lecture, which you'll be working with today.

In [3]:
countries = gpd.read_file("ne_110m_admin_0_countries.shp")

Group Activity

Write a function called highlight_population that takes a countries GeoDataFrame and a continent name as input and returns a plot that colors the specified continent by its population. Instead of plotting raw population numbers, the color should represent the continent's population as a share of the global population, expressed as a fraction between 0 and 1. To do this, add a new column called pop_ratio to the dataset.

The plot should show all countries outside the continent in grey (color #EEEEEE and edgecolor #FFFFFF). The plot should also include a legend whose color scale runs from 0 (vmin=0) to 1 (vmax=1). Finally, make sure the figsize is set to figsize=(15, 10).
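The dissolve-then-divide step described above can be tried on a tiny example first. The sketch below is only an illustration: the toy GeoDataFrame, its square "countries", and the continent names North and South are made up, and only the column names CONTINENT and POP_EST are borrowed from the real dataset.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data: four square "countries" on two made-up continents.
toy = gpd.GeoDataFrame(
    {
        "CONTINENT": ["North", "North", "South", "South"],
        "POP_EST": [10, 30, 40, 20],
    },
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, -1, 1, 0), box(1, -1, 2, 0)],
)

# Merge the country shapes into one shape per continent, summing the populations.
by_continent = toy.dissolve(by="CONTINENT", aggfunc="sum")

# Each continent's share of the total, as a fraction between 0 and 1.
by_continent["pop_ratio"] = by_continent["POP_EST"] / toy["POP_EST"].sum()

# Double brackets keep the result a one-row GeoDataFrame that can still be plotted.
north_only = by_continent.loc[["North"]]
print(north_only["pop_ratio"])
```

After dissolving, the continent names become the index, which is why a single continent can be picked out with .loc.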

In [4]:
def highlight_population(countries, continent):
    """Given a GeoDataFrame representing world data and a string continent name,
    returns a plot that colors the given continent
    by its share of the global population"""

    # calculating global population
    total_pop = countries["POP_EST"].sum()

    # dissolving on continent
    # NOTE: need to keep only the needed columns BEFORE dissolving because
    # the other columns hold categorical data that can't be summed

    countries_subset = countries[["geometry", "CONTINENT", "POP_EST"]]
    countries_subset = countries_subset.dissolve("CONTINENT", aggfunc="sum")

    # data manipulation
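    # compute the ratio, then select exactly one continent; the list [continent]
    # keeps a one-row GeoDataFrame, whereas slice(continent) would also return
    # every continent that sorts before it (e.g. "Asia" would include "Africa")
    countries_subset["pop_ratio"] = countries_subset["POP_EST"] / total_pop
    countries_subset = countries_subset.loc[[continent]]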

    # plotting
    fig, ax = plt.subplots(1, figsize=(15,10))
    countries.plot(ax=ax, color="#EEEEEE", edgecolor="#FFFFFF")
    countries_subset.plot(ax=ax, column="pop_ratio", legend=True, vmin=0, vmax=1)

    return ax

highlight_population(countries, "Africa")
Out[4]:
<Axes: >

Write a function called gdp_and_population_ratio that takes a countries GeoDataFrame as input and returns the two Axes objects of a figure with two subplots. The first subplot should color each continent by its share of the world's population, while the second should color each continent by its share of the world's GDP, both as fractions between 0 and 1. To achieve this, you may add new columns to the dataset called pop_ratio and gdp_ratio.

Each subplot should also include a legend whose color scale runs from 0 (vmin=0) to 1 (vmax=1). Finally, make sure the figsize is set to figsize=(15, 10)!
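The two-subplot layout with a shared 0-to-1 color scale can be sketched as follows. The toy GeoDataFrame and its ratio values are made up for illustration, and the Agg backend line is an assumption that only matters when running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; inside a notebook this line can be dropped
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Hypothetical toy data: two square regions with made-up ratio columns.
toy = gpd.GeoDataFrame(
    {"pop_ratio": [0.25, 0.75], "gdp_ratio": [0.6, 0.4]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)

# Two rows, one column: plt.subplots returns an array that unpacks into two Axes.
fig, (ax_pop, ax_gdp) = plt.subplots(2, 1, figsize=(15, 10))

# Fixing vmin and vmax puts both legends on the same 0-to-1 color scale.
toy.plot(ax=ax_pop, column="pop_ratio", legend=True, vmin=0, vmax=1)
toy.plot(ax=ax_gdp, column="gdp_ratio", legend=True, vmin=0, vmax=1)
ax_pop.set_title("Share of population")
ax_gdp.set_title("Share of GDP")
```

Because both plots use the same vmin and vmax, a given color means the same fraction in either subplot, which makes the two maps directly comparable.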

HINT: To find which columns you might want to use, you can call list(countries.columns) to see which columns are in the dataset.
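As a small illustration of the hint, the sketch below lists the columns of a made-up one-row GeoDataFrame and filters them by name. The toy frame and the substring filter are assumptions for the example; on the real dataset the same calls would be made on countries.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical one-row frame that reuses a few column names from the real dataset.
toy = gpd.GeoDataFrame(
    {"NAME": ["A"], "GDP_MD": [5], "POP_EST": [7]},
    geometry=[Point(0, 0)],
)

# Full list of column names.
print(list(toy.columns))

# Narrow the list down to names that mention GDP or population.
candidates = [c for c in toy.columns if "GDP" in c or "POP" in c]
print(candidates)
```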

In [5]:
def gdp_and_population_ratio(countries):
    """Given a GeoDataFrame representing world data,
    returns the two Axes of a two-subplot figure that shows each
    continent's share of world population and of world GDP"""

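    # data manipulation
    total_pop = countries["POP_EST"].sum()
    total_gdp = countries["GDP_MD"].sum()

    # dissolve to one row per continent so each continent gets a single color
    # (this also leaves the caller's GeoDataFrame unmodified)
    continents = countries[["geometry", "CONTINENT", "POP_EST", "GDP_MD"]]
    continents = continents.dissolve("CONTINENT", aggfunc="sum")
    continents["pop_ratio"] = continents["POP_EST"] / total_pop
    continents["gdp_ratio"] = continents["GDP_MD"] / total_gdp

    # plotting
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(15, 10))
    continents.plot(ax=ax1, column="pop_ratio", legend=True, vmin=0, vmax=1)
    continents.plot(ax=ax2, column="gdp_ratio", legend=True, vmin=0, vmax=1)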
    return ax1, ax2


gdp_and_population_ratio(countries)
Out[5]:
(<Axes: >, <Axes: >)