
In this lesson, we will practice working with DataFrames.

import pandas as pd
import doctest

Our Dataset!

Run the cell below to see the DataFrame you’ll be working with today. It is a dataset about the competing bakers at the Great Seattle Bake Off.

gsb = pd.DataFrame({
    "BakerID": [101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111],
    "Name": ["Arona", "Anaya", "Thrisha", "Arpan", "Alexis", "Asmi", "Alyssa", "Sheamin", "Sonoma", "Laura", "Kevin"],
    "DessertBaked": ["red velvet cake", "basque cheesecake", "baumkuchen", "pound cake", "german chocolate cake", 
                     "victoria sponge cake", "birthday cake", "carrot cake", "matcha cake", "tres leches", "fruit cake"],
    "FlavorScore": [60, 92, 78, 40, 38, 73, 50, 59, 75, 99, 70],
    "PresentationScore": [95, 80, 88, 92, 98, 100, 98, 60, 77, 100, 100],
    "CityOfOrigin": ["Paris", "New York City", "Florence", "Seattle", "Seattle", "Paris", "New York City", "Seattle", 
                     "Paris", "New York City", "Florence"],
    "StartedBaking": [2006, 2019, 2010, 2020, 2020, 2015, 2019, 2013, 2018, 2020, 2006]
})

gsb

Warm Up

Display all the column names in the cell below.

# TODO: Display column names

Then, select the bakers who received more than 70 points on their cake’s flavor and display the result as a DataFrame.

# TODO: Rows where the flavor score is > 70
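As a reminder of the two techniques used in this warm up, here is a minimal sketch on a small made-up DataFrame. The `toy` table and its `Name` and `Score` columns are hypothetical and are not part of the bake-off data; the same pattern should carry over to `gsb`.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
toy = pd.DataFrame({
    "Name": ["Ada", "Ben", "Cy"],
    "Score": [65, 82, 71],
})

# .columns holds the column labels; list() makes them easier to read
column_names = list(toy.columns)

# A boolean mask inside [] keeps only the rows where the condition is True,
# and the result is still a DataFrame
high_scores = toy[toy["Score"] > 70]

print(column_names)
print(high_scores)
```

Note that the comparison `toy["Score"] > 70` is evaluated for every row at once, so no loop is needed.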

Practice Problems: DataFrame Manipulation

Add a new column called CreativityScore with the values [100, 80, 70, 90, 100, 20, 80, 90, 100, 70, 70].

# TODO: Create CreativityScore column

Now, for each baker, calculate the mean of the FlavorScore, PresentationScore, and CreativityScore columns, and store that value in a new column, TotalScore.

# TODO: Calculate mean and create TotalScore
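The sketch below shows one way to add a column from a list and then take a row-wise mean across several columns. It uses a hypothetical `toy` DataFrame with made-up columns `A`, `B`, `C`, and `RowMean`, and it assumes that "mean" here is meant per row rather than per column.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
toy = pd.DataFrame({"A": [10, 20], "B": [30, 40]})

# Assigning a list to a new column name adds that column;
# the list must have one value per row
toy["C"] = [50, 90]

# axis=1 averages across the selected columns within each row
toy["RowMean"] = toy[["A", "B", "C"]].mean(axis=1)

print(toy)
```

Leaving out `axis=1` would instead return one mean per column, which is not what a per-row score needs.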

Next, change the index of the DataFrame to BakerID, then look up Baker 105’s CityOfOrigin. What about their StartedBaking year?

# TODO: Update the index
# TODO: Who is Baker 105?
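For reference, here is a minimal sketch of re-indexing and label-based lookup on a hypothetical `toy` DataFrame (the `ID`, `City`, and `Year` columns are made up for illustration). It relies on `set_index` returning a new DataFrame rather than changing the original in place.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
toy = pd.DataFrame({
    "ID": [1, 2, 3],
    "City": ["Oslo", "Lima", "Cairo"],
    "Year": [2001, 2015, 2010],
})

# set_index returns a new DataFrame, so the result must be stored
indexed = toy.set_index("ID")

# .loc looks up by index label (here the ID), then by column name
city = indexed.loc[2, "City"]
year = indexed.loc[2, "Year"]

print(city, year)
```

Once `ID` is the index it is no longer a regular column, so `indexed["ID"]` would raise a `KeyError`.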

Practice Problems: Groupby

Find the BakerIDs with the highest FlavorScore in each CityOfOrigin.

# TODO: Who has the highest FlavorScore per CityOfOrigin group?
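One possible approach to "which row has the largest value in each group" is `groupby` followed by `idxmax`, sketched here on a hypothetical `toy` DataFrame with made-up `ID`, `Team`, and `Score` columns. This is one option among several, not the only valid solution.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Team": ["red", "blue", "red", "blue"],
    "Score": [5, 9, 7, 3],
})

# idxmax returns, for each group, the index label of the row with the largest Score
top_rows = toy.groupby("Team")["Score"].idxmax()

# Use those labels to pull the matching rows back out of the original table
top_ids = toy.loc[top_rows, ["Team", "ID"]]

print(top_ids)
```

If the ID column has already been made the index, the values returned by `idxmax` are the IDs themselves. When two rows tie for the maximum, `idxmax` reports only the first one.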

In each CityOfOrigin, count how many bakers StartedBaking in each year.

# TODO: When did our TAs start baking?
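Counting within two levels of grouping can be sketched by passing a list of columns to `groupby`. The `toy` DataFrame and its `ID`, `Team`, and `Year` columns below are hypothetical, chosen only to illustrate the pattern.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Team": ["red", "red", "blue", "red", "blue"],
    "Year": [2019, 2019, 2020, 2021, 2020],
})

# Grouping by a list of columns makes one group per (Team, Year) pair;
# count() then tallies the non-missing ID values in each group
counts = toy.groupby(["Team", "Year"])["ID"].count()

print(counts)
```

The result is a Series with a two-level index, so a single count can be read with a tuple such as `counts.loc[("red", 2019)]`. Only pairs that actually occur in the data appear in the result.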