Pandas Practice Continued & Data Visualization
In this lesson, we will practice using groupby and the data visualization tools we learned this week.
import pandas as pd
import seaborn as sns
Our Dataset!
Run the cell below to see the DataFrame you'll be working with today.
scoreboard = pd.DataFrame({
    "Player": ["Arona", "Hannah", "Renusree", "Arpan", "Mia", "Asmi", "Alyssa", "Vani", "Vatsal",
               "Jasmine", "Kevin", "Alessia"],
    "FavoriteTrack": ["Bowser’s Castle", "Rainbow Road", "Toad Harbor", "Big Blue", "Cheese Land",
                      "Toad Harbor", "Mario Circuit", "Bowser’s Castle", "Mario Circuit",
                      "Coconut Mall", "Mario Circuit", "Cheese Land"],
    "Coins": [9, 8, 8, 9, 9, 10, 9, 7, 9, 8, 9, 10],
    "Mushrooms": [2, 0, 3, 1, 2, 2, 0, 3, 3, 1, 2, 3],
    "TopSpeed": [150, 70, 60, 125, 30, 20, 80, 94, 10, 77, 23, 49],
    "Character": ["Monty Mole", "Yoshi", "Luigi", "Blue Toad", "Toadette", "Princess Peach",
                  "Princess Daisy", "Waluigi", "King Boo", "Bowser", "Mario", "Wario"],
    "Drivetrain": ["Bike", "Car", "4 wheeler", "Car", "Stroller", "4 wheeler", "Car", "Bike",
                   "Stroller", "4 wheeler", "Bike", "Bike"],
    "Playstyle": ["Aggressive", "Aggressive", "Resourceful", "Speedster", "Resourceful", "Resourceful",
                  "Balanced", "Aggressive", "Balanced", "Balanced", "Balanced", "Resourceful"]
})
scoreboard
| | Player | FavoriteTrack | Coins | Mushrooms | TopSpeed | Character | Drivetrain | Playstyle |
---|---|---|---|---|---|---|---|---|
0 | Arona | Bowser’s Castle | 9 | 2 | 150 | Monty Mole | Bike | Aggressive |
1 | Hannah | Rainbow Road | 8 | 0 | 70 | Yoshi | Car | Aggressive |
2 | Renusree | Toad Harbor | 8 | 3 | 60 | Luigi | 4 wheeler | Resourceful |
3 | Arpan | Big Blue | 9 | 1 | 125 | Blue Toad | Car | Speedster |
4 | Mia | Cheese Land | 9 | 2 | 30 | Toadette | Stroller | Resourceful |
5 | Asmi | Toad Harbor | 10 | 2 | 20 | Princess Peach | 4 wheeler | Resourceful |
6 | Alyssa | Mario Circuit | 9 | 0 | 80 | Princess Daisy | Car | Balanced |
7 | Vani | Bowser’s Castle | 7 | 3 | 94 | Waluigi | Bike | Aggressive |
8 | Vatsal | Mario Circuit | 9 | 3 | 10 | King Boo | Stroller | Balanced |
9 | Jasmine | Coconut Mall | 8 | 1 | 77 | Bowser | 4 wheeler | Balanced |
10 | Kevin | Mario Circuit | 9 | 2 | 23 | Mario | Bike | Balanced |
11 | Alessia | Cheese Land | 10 | 3 | 49 | Wario | Bike | Resourceful |
Group Activity
Find the Players with the most Coins in each Drivetrain.
# TODO: Who has the most coins per Drivetrain group?
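One possible approach to the exercise above, offered as a sketch rather than the intended answer: compute each group's maximum with groupby and transform, then keep the rows that match it. The block rebuilds only the three columns it needs from the scoreboard; the names max_coins and top_players are illustrative, and keeping ties is an assumption about what "most" should mean here.

```python
import pandas as pd

# Only the columns needed for this exercise, copied from the scoreboard.
scoreboard = pd.DataFrame({
    "Player": ["Arona", "Hannah", "Renusree", "Arpan", "Mia", "Asmi", "Alyssa",
               "Vani", "Vatsal", "Jasmine", "Kevin", "Alessia"],
    "Coins": [9, 8, 8, 9, 9, 10, 9, 7, 9, 8, 9, 10],
    "Drivetrain": ["Bike", "Car", "4 wheeler", "Car", "Stroller", "4 wheeler",
                   "Car", "Bike", "Stroller", "4 wheeler", "Bike", "Bike"],
})

# Highest Coins value within each Drivetrain, repeated on every row of that group.
max_coins = scoreboard.groupby("Drivetrain")["Coins"].transform("max")

# Keep every player whose Coins equal their group's maximum (ties are kept).
top_players = scoreboard[scoreboard["Coins"] == max_coins]
top_players = top_players.sort_values("Drivetrain")[["Drivetrain", "Player", "Coins"]]
print(top_players)
```

Using transform keeps the result aligned with the original rows, so tied players all appear. A call such as idxmax would instead return a single player per group.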
Count how many Players in each Playstyle category like each FavoriteTrack.
# TODO: Which Playstyle likes Cheese Land the most?
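A sketch of one way to count the pairs described above: group by both columns and take the size of each group. The names counts, cheese_land and top_style are illustrative, and the block rebuilds only the columns it needs from the scoreboard.

```python
import pandas as pd

# Only the columns needed for this exercise, copied from the scoreboard.
scoreboard = pd.DataFrame({
    "Player": ["Arona", "Hannah", "Renusree", "Arpan", "Mia", "Asmi", "Alyssa",
               "Vani", "Vatsal", "Jasmine", "Kevin", "Alessia"],
    "FavoriteTrack": ["Bowser’s Castle", "Rainbow Road", "Toad Harbor", "Big Blue",
                      "Cheese Land", "Toad Harbor", "Mario Circuit", "Bowser’s Castle",
                      "Mario Circuit", "Coconut Mall", "Mario Circuit", "Cheese Land"],
    "Playstyle": ["Aggressive", "Aggressive", "Resourceful", "Speedster", "Resourceful",
                  "Resourceful", "Balanced", "Aggressive", "Balanced", "Balanced",
                  "Balanced", "Resourceful"],
})

# Number of players for every (FavoriteTrack, Playstyle) pair that occurs.
counts = scoreboard.groupby(["FavoriteTrack", "Playstyle"]).size()
print(counts)

# Narrow to one track, then pick the Playstyle with the largest count.
cheese_land = counts.loc["Cheese Land"]
top_style = cheese_land.idxmax()
print(top_style)
```

Pairs that never occur are simply absent from counts. If a full grid with zeros is more useful, pd.crosstab on the same two columns is an alternative.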
Create both a line plot and a scatter plot to visualize TopSpeed trends by Playstyle. Compare how effective each plot is at revealing patterns or outliers.
# TODO: Visualize!
What if we wanted to index our DataFrame by two columns? Set the index to Drivetrain and FavoriteTrack, then find all the Bikers who like Bowser’s Castle.
# TODO: Which Bikers like Bowser's?
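A sketch of one way to do the two-column index lookup described above, using set_index followed by .loc with a tuple key. The names indexed and bowser_bikers are illustrative. The track name in the data is spelled with a curly apostrophe (’), so the lookup key has to match that spelling exactly.

```python
import pandas as pd

# Only the columns needed for this exercise, copied from the scoreboard.
scoreboard = pd.DataFrame({
    "Player": ["Arona", "Hannah", "Renusree", "Arpan", "Mia", "Asmi", "Alyssa",
               "Vani", "Vatsal", "Jasmine", "Kevin", "Alessia"],
    "FavoriteTrack": ["Bowser’s Castle", "Rainbow Road", "Toad Harbor", "Big Blue",
                      "Cheese Land", "Toad Harbor", "Mario Circuit", "Bowser’s Castle",
                      "Mario Circuit", "Coconut Mall", "Mario Circuit", "Cheese Land"],
    "Drivetrain": ["Bike", "Car", "4 wheeler", "Car", "Stroller", "4 wheeler",
                   "Car", "Bike", "Stroller", "4 wheeler", "Bike", "Bike"],
})

# Build a two-level index; set_index returns a new DataFrame and leaves scoreboard unchanged.
# Sorting the index keeps tuple lookups fast and avoids performance warnings.
indexed = scoreboard.set_index(["Drivetrain", "FavoriteTrack"]).sort_index()

# Select rows by giving one value per index level; the trailing ":" keeps all columns.
bowser_bikers = indexed.loc[("Bike", "Bowser’s Castle"), :]
print(bowser_bikers)
```

Because set_index is not called with inplace=True here, the original scoreboard keeps its default integer index.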
Whole Class Activity
Write a function players_above_average that calculates the average Coins for each Playstyle, then lists the Players whose Coins are above the average of the inputted Playstyle.
# Run this cell once before writing your function!
scoreboard.reset_index(inplace=True)
# TODO: Identify standout TAs!
def players_above_average(data, playstyle):
    ...
players_above_average(scoreboard, "Resourceful")
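One possible implementation, given as a sketch and not as the intended solution. It assumes the function should return a list of names and should only consider players who belong to the given Playstyle; the prompt could also be read as comparing every player against that average. The local names avg_by_style, threshold and mask are illustrative.

```python
import pandas as pd

# Only the columns needed for this exercise, copied from the scoreboard.
scoreboard = pd.DataFrame({
    "Player": ["Arona", "Hannah", "Renusree", "Arpan", "Mia", "Asmi", "Alyssa",
               "Vani", "Vatsal", "Jasmine", "Kevin", "Alessia"],
    "Coins": [9, 8, 8, 9, 9, 10, 9, 7, 9, 8, 9, 10],
    "Playstyle": ["Aggressive", "Aggressive", "Resourceful", "Speedster", "Resourceful",
                  "Resourceful", "Balanced", "Aggressive", "Balanced", "Balanced",
                  "Balanced", "Resourceful"],
})


def players_above_average(data, playstyle):
    # Average Coins for every Playstyle, indexed by Playstyle name.
    avg_by_style = data.groupby("Playstyle")["Coins"].mean()

    # Average for the requested Playstyle (raises KeyError if it is not in the data).
    threshold = avg_by_style[playstyle]

    # Assumption: only players of that Playstyle are compared against its average.
    mask = (data["Playstyle"] == playstyle) & (data["Coins"] > threshold)
    return data.loc[mask, "Player"].tolist()


print(players_above_average(scoreboard, "Resourceful"))
```

For "Resourceful" the group average is 9.25 Coins (8, 9, 10 and 10), so only the two players with 10 Coins are returned.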